CANADIAN
JOURNAL OF MATHEMATICS


Journal Canadien de Mathématiques 


VOL. II - NO. 2 
1950 


Generalized Hamiltonian dynamics P. A. M. Dirac 


Periodic linear transformations of 
affine and projective geometries Ernst Snapper 


Elementarfunktionen auf 
Riemannschen Flächen Behnke und Stein


The degrees of the irreducible 
representations of a group Hans Zassenhaus 


Identities and congruences of the 
Ramanujan type K. G. Ramanathan 


A class of self-dual maps Smith and Tutte 
Squaring the square W. T. Tutte 
Water waves over a channel of finite depth A. E. Heins 


Functions satisfying a certain 
integral equation F. M. Goodspeed 


Minimal solutions of Diophantine equations Ludwig Holzer 


Convex domains and Tchebycheff’s 
approximation problem Rademacher and Schoenberg 


Published for 
THE CANADIAN MATHEMATICAL CONGRESS 
by the 


University of Toronto Press 





EDITORIAL BOARD 


H. S. M. Coxeter, A. Gauthier, L. Infeld, R. D. James, R. L. Jeffery, 
G. de B. Robinson 


with the co-operation of 


R. Brauer, J. Chapelon, D. B. DeLury, P. Dubreil, I. Halperin,
W. V. D. Hodge, S. MacLane, L. J. Mordell, G. Pall, J. L. Synge, 
A. W. Tucker, W. J. Webber 


The chief languages of the Journal are English and French. 


Manuscripts for publication in the Journal should be sent to the 
Editor-in-Chief, H. S. M. Coxeter, University of Toronto. Every paper 
should contain an introduction summarizing the results as far as possible 
in such a way as to be understood by the non-expert. 


All other correspondence should be addressed to the Managing 
Editor, G. de B. Robinson, University of Toronto. 


The Journal is published quarterly. Subscriptions should be sent 
to the Managing Editor. The price per volume of four numbers is 
$6.00. This is reduced to $3.00 for individuals who are members of 
the following Societies: 


Canadian Mathematical Congress 
American Mathematical Society 
Mathematical Association of America 
London Mathematical Society 

Société Mathématique de France 


The Canadian Mathematical Congress gratefully acknowledges the 
assistance of the following towards the cost of publishing this Journal: 


University of British Columbia
Carleton College
École Polytechnique
Loyola College
University of Manitoba
McGill University
McMaster University
Université de Montréal
Queen’s University
Royal Military College
University of Toronto

National Research Council 
and the 
American Mathematical Society 


AUTHORIZED AS SECOND CLASS MAIL, POST OFFICE DEPARTMENT, OTTAWA 








GENERALIZED HAMILTONIAN DYNAMICS 


P. A. M. DIRAC 


1. Introduction. The equations of dynamics were put into a general form 
by Lagrange, who expressed them in terms of a set of generalized coordinates 
and velocities. An alternative general form was later given by Hamilton, in 
terms of coordinates and momenta. Let us consider the relative merits of the 
two forms. 

With the Lagrangian form the requirements of special relativity can very 
easily be satisfied, simply by taking the action, i.e. the time integral of the 
Lagrangian, to be Lorentz invariant. There is no such simple way of making 
the Hamiltonian form relativistic. 

For the purpose of setting up a quantum theory one must work from the 
Hamiltonian form. There are well-established rules for passing from Hamilton's 
dynamics to quantum dynamics, by making the coordinates and momenta 
into linear operators. The rules lead to definite results in simple cases and, 
although they cannot be applied to complicated examples without ambiguity, 
they have proved to be adequate for practical purposes. 

Thus both forms have their special values at the present time and one must 
work with both. The two forms are closely connected. Starting with any 
Lagrangian one can introduce the momenta and, in the case when the momenta 
are independent functions of the velocities, one can obtain the Hamiltonian. 
The present paper is concerned with setting up a more general theory which 
can be applied also when the momenta are not independent functions of the 
velocities. A more general form of Hamiltonian dynamics is obtained, which 
can still be used for the purpose of quantization, and which turns out to be 
specially well suited for a relativistic description of dynamical processes. 


2. Strong and weak equations. We consider a dynamical system of $N$ degrees
of freedom, described in terms of generalized coordinates $q_n$ ($n = 1, 2, \ldots, N$)
and velocities $dq_n/dt$ or $\dot q_n$. We assume a Lagrangian $L$, which for the present
can be any function of the coordinates and velocities


(1) $L \equiv L(q, \dot q)$.


We define the momenta by
(2) $p_n = \partial L/\partial \dot q_n$.

For the development of the theory we introduce a variation procedure,
varying each of the quantities $q_n$, $\dot q_n$, $p_n$ independently by a small quantity
$\delta q_n$, $\delta\dot q_n$, $\delta p_n$ of order $\epsilon$ and working to the accuracy of $\epsilon$. As a result of this


This paper is based on the first half of a course of lectures given at the Canadian
Mathematical Seminar in Vancouver in August 1949.


variation procedure equation (2) will get violated, as its left-hand side will be
made to differ from its right-hand side by a quantity of order $\epsilon$. We shall now
have to distinguish between two kinds of equations, equations such as (2)
which get violated by a quantity of order $\epsilon$ when we apply the variation, and
equations which remain valid to the accuracy $\epsilon$ under the variation. Equation
(1) will be of the latter kind, since the variation in $L$ will equal, by definition,
the variation of the function $L(q, \dot q)$. The former kind of equation we shall call
a weak equation and write with the usual equality sign $=$, the latter we shall
call a strong equation and write with the sign $\equiv$.

We have the following rules governing algebraic work with weak and strong
equations:

if $A \equiv 0$ then $\delta A = 0$;
if $X = 0$ then $\delta X \neq 0$
in general. From the weak equation $X = 0$ we can deduce
$\delta X^2 = 2X\,\delta X = 0$,
so we can deduce the strong equation
$X^2 \equiv 0$.
Similarly, from two weak equations $X_1 = 0$ and $X_2 = 0$ we can deduce the
strong equation
$X_1 X_2 \equiv 0$.
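
The last deduction can be checked in one line. The following is an added illustration, not part of the original derivation; it uses only the variation procedure defined above and starts from a point at which both weak equations hold.

```latex
% Added check: the product of two weakly vanishing quantities vanishes strongly.
\begin{align*}
\delta(X_1 X_2) &= X_1\,\delta X_2 + X_2\,\delta X_1 = 0 ,
\end{align*}
% Each term carries an undifferentiated factor X_1 or X_2, which is zero at the
% starting point, so X_1 X_2 stays zero to the accuracy \epsilon: X_1 X_2 \equiv 0.
```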

It may be that the $N$ quantities $\partial L/\partial\dot q_n$ on the right-hand side of (2) are all
independent functions of the $N$ velocities $\dot q_n$. In this case equations (2)
determine each $\dot q$ as a function of the $q$'s and $p$'s. This case will be referred to as
the standard case, and is the only one usually considered in dynamical theory.

If the $\partial L/\partial\dot q$'s are not independent functions of the velocities, we can
eliminate the $\dot q$'s from equations (2) and obtain one or more equations
(3) $\phi(q, p) = 0$
involving only $q$'s and $p$'s. We may suppose equation (3) to be written in
such a way that the variation procedure changes $\phi$ by a quantity of order $\epsilon$,
since if it changes $\phi$ by a quantity of order $\epsilon^k$, we have only to replace $\phi$ by
$\phi^{1/k}$ in (3) and the desired condition will be fulfilled. We now have equation (3)
violated by the order $\epsilon$ when we apply the variation, so it is correctly written
as a weak equation.
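
As a concrete illustration, consider a simple Lagrangian chosen here for the purpose (it does not appear in the original text): two coordinates whose velocities enter only through their difference. The sketch below works out the momenta from (2) and the resulting equation of the type (3).

```latex
% Hypothetical example with N = 2; the Lagrangian is an assumption made for illustration.
\begin{align*}
L &\equiv \tfrac12(\dot q_1 - \dot q_2)^2 , \\
p_1 &= \partial L/\partial\dot q_1 = \dot q_1 - \dot q_2 , \qquad
p_2 = \partial L/\partial\dot q_2 = -(\dot q_1 - \dot q_2) , \\
\phi(q, p) &\equiv p_1 + p_2 = 0 .
\end{align*}
% Under the variation, \delta\phi = \delta p_1 + \delta p_2 is of order \epsilon,
% so \phi = 0 is correctly written as a weak equation. Had one written
% (p_1 + p_2)^2 = 0 instead, its variation would be of order \epsilon^2, and one
% would replace it by its square root.
```

Both momenta here depend on the single combination $\dot q_1 - \dot q_2$, so they are not independent functions of the velocities, and there is just one relation between the $q$'s and $p$'s.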

We shall need to use a complete set of independent equations of the type (3),
say
(4) $\phi_m(q, p) = 0$, $m = 1, 2, \ldots, M$.


The condition of independence means that none of the $\phi$'s is expressible
linearly in terms of the others, with functions of the $q$'s and $p$'s as coefficients.
The condition of completeness means that any function of the $q$'s and $p$'s
which vanishes on account of equations (2) and changes by the order $\epsilon$ with
the variation procedure is expressible as a linear function of the $\phi_m$, with
functions of the $q$'s and $p$'s as coefficients.

We may picture the relationship of strong and weak equations in the
following way. Take the $3N$ dimensional space with the $q$'s, $\dot q$'s and $p$'s as
coordinates. In this space there will be a $2N$ dimensional region in which
equations (2) are satisfied. Call it the region $R$. Equations (4) will also be
satisfied in this region, as they are consequences of (2). Now consider all
points of the $3N$ dimensional space which are within a distance of order $\epsilon$
from $R$. They will form a $3N$ dimensional region like a shell with a thickness
of order $\epsilon$. Call this the region $R_\epsilon$. A weak equation holds in the region $R$,
a strong equation holds in the region $R_\epsilon$.


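3. The Hamiltonian. The Hamiltonian $H$ is defined by
(5) $H \equiv p_n\dot q_n - L$,
where a summation is understood over all values for a repeated suffix in a
term. We have

$\delta H = \delta(p_n\dot q_n - L)$
$= p_n\,\delta\dot q_n + \dot q_n\,\delta p_n - \partial L/\partial q_n\,\delta q_n - \partial L/\partial\dot q_n\,\delta\dot q_n$
(6) $= \dot q_n\,\delta p_n - \partial L/\partial q_n\,\delta q_n$,
with the help of (2). We find that $\delta H$ does not depend on the $\delta\dot q$'s. This
important result holds whether we have the standard case or not.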

Equation (5) gives a definition for $H$ as a function of the $q$'s, $\dot q$'s and $p$'s,
holding throughout the $3N$ dimensional space of $q$'s, $\dot q$'s and $p$'s. We shall
use the definition only in the region $R_\epsilon$, and in this region the result (6)
holds, to the first order. This means that, if we keep the $q$'s and $p$'s
constant and make a first-order variation in the $\dot q$'s, the variation in $H$ will be
of the second order. Thus if we keep the $q$'s and $p$'s constant and make a
finite variation in the $\dot q$'s, keeping all the time in the region $R_\epsilon$ (which is
possible when we do not have the standard case), the variation in $H$ will be
of the first order. If we keep in the region $R$ the variation in $H$ will be zero.
It follows that in the region $R$, $H$ is a function of the $q$'s and $p$'s only. Calling
this function $\mathcal{H}(q, p)$, we have the weak equation


(7) $H = \mathcal{H}(q, p)$
holding in the region $R$. In the standard case the function $\mathcal{H}$ is the ordinary
Hamiltonian.

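Starting from a point in $R$ and making a general variation, we have from (6)

$\delta(H - \mathcal{H}) = \left(\dot q_n - \frac{\partial\mathcal{H}}{\partial p_n}\right)\delta p_n - \left(\frac{\partial L}{\partial q_n} + \frac{\partial\mathcal{H}}{\partial q_n}\right)\delta q_n$.

Thus $\delta(H - \mathcal{H})$ depends only on the $\delta q$'s and $\delta p$'s. If the variation is such
that we stay in the region $R$, then of course $\delta(H - \mathcal{H}) = 0$. Thus $\delta(H - \mathcal{H})$
vanishes for any variation of the $q$'s and $p$'s such that one can choose the $\delta\dot q$'s
so as to preserve equations (2). The only restriction this imposes on the $\delta q$'s
and $\delta p$'s is that they must preserve equations (4), i.e. they must lead to
$\delta\phi_m = 0$ for all $m$. Thus $\delta(H - \mathcal{H})$ is zero for any values $\delta q$, $\delta p$ that make
$\delta\phi_m = 0$, and hence for arbitrary $\delta q$, $\delta p$


(8) $\delta(H - \mathcal{H}) = v_m\,\delta\phi_m$


with suitable coefficients $v_m$. These coefficients will be functions of the $q$'s,
$\dot q$'s and $p$'s, and with the help of (2) can be expressed as functions of the
$q$'s and $\dot q$'s only. We now get


$\delta(H - \mathcal{H} - v_m\phi_m) = \delta(H - \mathcal{H}) - v_m\,\delta\phi_m - \phi_m\,\delta v_m = 0$
from (8) and (4), and hence
(9) $H \equiv \mathcal{H} + v_m\phi_m$.


We have here a strong equation, holding to the first order in the region $R_\epsilon$,
in contradistinction to the weak equation (7), which holds only in $R$.
Equation (8) gives


$\delta H = \delta\mathcal{H} + v_m\,\delta\phi_m$
$= \frac{\partial\mathcal{H}}{\partial p_n}\delta p_n + \frac{\partial\mathcal{H}}{\partial q_n}\delta q_n + v_m\left(\frac{\partial\phi_m}{\partial p_n}\delta p_n + \frac{\partial\phi_m}{\partial q_n}\delta q_n\right)$.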


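Comparing this with (6), we get

(10) $\dot q_n = \frac{\partial\mathcal{H}}{\partial p_n} + v_m\frac{\partial\phi_m}{\partial p_n}$,

(11) $-\frac{\partial L}{\partial q_n} = \frac{\partial\mathcal{H}}{\partial q_n} + v_m\frac{\partial\phi_m}{\partial q_n}$.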


Equations (10) give the $\dot q$'s in terms of the $q$'s, $p$'s and $v$'s. They show
that the $2N$ variables $q_n$, $\dot q_n$ can be expressed in terms of the $2N + M$ variables
$q_n$, $p_n$, $v_m$. Between these $2N + M$ variables there exist the $M$ relations (4).
There cannot be any other relations between these variables, as otherwise
the $2N$ variables $q_n$, $\dot q_n$ would not be independent. Thus the $v$'s must each
be independent of the $q$'s, $p$'s and other $v$'s. The $v$'s can be considered as a
kind of velocity variables, which serve to fix those $\dot q$'s that cannot be
expressed in terms of $q$'s and $p$'s.

When we work with the Hamiltonian form of dynamics we use as basic
variables the $q$'s, $p$'s and $v$'s, between which certain relations (4) are assumed
to exist, and which are otherwise independent. These variables will be called
the Hamiltonian variables.
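
To see how a $v$ arises in practice, one may take the illustrative Lagrangian $L \equiv \tfrac12(\dot q_1 - \dot q_2)^2$ (an assumption introduced here, not an example from the original text), for which (2) gives $p_1 = \dot q_1 - \dot q_2$, $p_2 = -(\dot q_1 - \dot q_2)$ and the single relation $\phi \equiv p_1 + p_2 = 0$. A sketch of equations (7), (9) and (10) for this case:

```latex
% Hypothetical example; L, \phi and the choice of \mathcal{H} are assumptions.
\begin{align*}
% (7): in the region R, where p_2 = -p_1 and \dot q_1 - \dot q_2 = p_1
H &= p_1\dot q_1 + p_2\dot q_2 - \tfrac12(\dot q_1 - \dot q_2)^2 = \tfrac12 p_1^2 ,
  \qquad \text{so one may take } \mathcal{H} = \tfrac12 p_1^2 . \\
% (9): the remainder is of second order in a weakly vanishing quantity
H - \mathcal{H} - \dot q_2\,\phi &= -\tfrac12\bigl(p_1 - \dot q_1 + \dot q_2\bigr)^2 \equiv 0 ,
  \qquad \text{so } v = \dot q_2 . \\
% (10)
\dot q_1 &= \partial\mathcal{H}/\partial p_1 + v\,\partial\phi/\partial p_1 = p_1 + v , \qquad
\dot q_2 = \partial\mathcal{H}/\partial p_2 + v\,\partial\phi/\partial p_2 = v .
\end{align*}
```

The velocity $\dot q_2$ is not fixed by the $q$'s and $p$'s, and $v$ supplies it. The choice of $\mathcal{H}$ in this sketch is not unique: adding a multiple of $\phi$ to it merely changes $v$.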


4. The equations of motion. We assume the usual Lagrangian equations
of motion as weak equations,
(12) $\dot p_n = \partial L/\partial q_n$.
By substituting for the $p$'s in (12) their values given by (2), we get equations
involving the accelerations $\ddot q_n$. In the standard case these equations will
determine all the $\ddot q$'s in terms of the $q$'s and $\dot q$'s. In the case with $M$ equations
(4), the equations of motion will give us only $N - M$ equations for the $\ddot q$'s.
The remaining $M$ equations of motion will tell us how the $\phi_m$'s vary with time.
For consistency the $\phi_m$'s must remain zero. These consistency conditions
will be examined later.


With the help of (11) the equations of motion (12) take the form

(13) $\dot p_n = -\frac{\partial\mathcal{H}}{\partial q_n} - v_m\frac{\partial\phi_m}{\partial q_n}$.


Equations (13) together with (10) constitute the Hamiltonian equations of
motion. They are fixed by the function $\mathcal{H}$ and the equations $\phi_m = 0$. The
Hamiltonian equations of motion give us the $\dot q$'s and $\dot p$'s in terms of the
Hamiltonian variables $q$, $p$, $v$. They give us no direct information about the
$v$'s, but will give us some information indirectly when we examine the
consistency conditions.

The Hamiltonian equations of motion can be expressed more easily with
the help of the Poisson bracket notation. Any two functions $\xi$ and $\eta$ of the
$q$'s and $p$'s have a P. b. $[\xi, \eta]$, defined by


(14) $[\xi, \eta] = \frac{\partial\xi}{\partial q_n}\frac{\partial\eta}{\partial p_n} - \frac{\partial\xi}{\partial p_n}\frac{\partial\eta}{\partial q_n}$.

It is easily verified that the P. b. remains invariant under a transformation
to new $q$'s and $p$'s, in which the new $q$'s are any independent functions of the
original $q$'s and the new $p$'s are defined by the new equations (2) with $L$
expressed in terms of the new $q$'s and their time derivatives. This invariance
property gives the P. b. its importance.
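
For orientation, the definition (14) can be applied to a few simple functions. The choice $N = 2$ and the function $\ell$ below are illustrative assumptions, not taken from the original text.

```latex
% Added illustration of definition (14); \ell = q_1 p_2 - q_2 p_1 is a hypothetical choice.
\begin{align*}
[q_n, p_{n'}] &= \delta_{nn'} , \qquad [q_n, q_{n'}] = 0 , \qquad [p_n, p_{n'}] = 0 , \\
[\ell, p_1] &= \partial\ell/\partial q_1 = p_2 , \qquad
[\ell, p_2] = \partial\ell/\partial q_2 = -p_1 , \\
[\ell, p_1^2 + p_2^2] &= \frac{\partial\ell}{\partial q_1}\,2p_1 + \frac{\partial\ell}{\partial q_2}\,2p_2
  = 2p_1 p_2 - 2p_1 p_2 = 0 .
\end{align*}
```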

P. b.'s are subject to the following laws, which are easily verified from the
definition:


$[\xi, \eta] = -[\eta, \xi]$,

(15) $[\xi, f(\eta_1, \eta_2, \ldots)] = \frac{\partial f}{\partial\eta_1}[\xi, \eta_1] + \frac{\partial f}{\partial\eta_2}[\xi, \eta_2] + \ldots$,

$[\xi, [\eta, \zeta]] + [\eta, [\zeta, \xi]] + [\zeta, [\xi, \eta]] = 0$.


In the second of these laws $f$ is any function of various quantities $\eta_1, \eta_2, \ldots$,
each of which is a function of the $q$'s and $p$'s. The third law, known as
Poisson's identity, applies to any three functions $\xi$, $\eta$, $\zeta$ of the $q$'s and $p$'s.

It is desirable to extend the notion of P. b.'s to include functions of the
$\dot q$'s which are not expressible in terms of the $q$'s and $p$'s. We assume these
more general P. b.'s are subject to the laws (15) but are otherwise arbitrary.
Alternatively, we may assume that the $\dot q$'s are arbitrary functions of the $q$'s
and $p$'s, and the laws (15) can then be deduced with $\xi$, the $\eta$'s and $\zeta$ involving
the $\dot q$'s.

From a strong equation $A \equiv 0$ we can infer the weak equations

$\partial A/\partial q_n = 0$, $\partial A/\partial\dot q_n = 0$, $\partial A/\partial p_n = 0$,

and hence, by an application of the second of the laws (15),
$[\xi, A] = 0$
for any $\xi$. We may have $[\xi, A] \equiv 0$ (for example when $A \equiv 0$ by definition),
but this is not necessarily so. From a weak equation $X = 0$ we cannot infer
$[\xi, X] = 0$ in general.

If $g$ is any function of the $q$'s and $p$'s, we have from (10) and (13)

$\dot g = \frac{\partial g}{\partial q_n}\dot q_n + \frac{\partial g}{\partial p_n}\dot p_n$
$= \frac{\partial g}{\partial q_n}\left(\frac{\partial\mathcal{H}}{\partial p_n} + v_m\frac{\partial\phi_m}{\partial p_n}\right) - \frac{\partial g}{\partial p_n}\left(\frac{\partial\mathcal{H}}{\partial q_n} + v_m\frac{\partial\phi_m}{\partial q_n}\right)$

(16) $= [g, \mathcal{H}] + v_m[g, \phi_m]$.
This is the general Hamiltonian equation of motion. It may also be written,
with the help of (4), as
(17) $\dot g = [g, H]$,
when it is the same as the usual Hamiltonian equation of motion in P. b.
notation.
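
As an illustration of (16), one may take the Lagrangian $L \equiv \tfrac12(\dot q_1 - \dot q_2)^2$, for which (2) leads to the single relation $\phi \equiv p_1 + p_2 = 0$ and one may choose $\mathcal{H} = \tfrac12 p_1^2$. This example is an assumption introduced here and does not appear in the original text.

```latex
% Hypothetical example: \mathcal{H} = p_1^2/2, \phi = p_1 + p_2, a single v.
\begin{align*}
\dot q_1 &= [q_1, \mathcal{H}] + v\,[q_1, \phi] = p_1 + v , &
\dot q_2 &= [q_2, \mathcal{H}] + v\,[q_2, \phi] = v , \\
\dot p_1 &= [p_1, \mathcal{H}] + v\,[p_1, \phi] = 0 , &
\dot p_2 &= [p_2, \mathcal{H}] + v\,[p_2, \phi] = 0 .
\end{align*}
% Hence \dot\phi = \dot p_1 + \dot p_2 = 0, so the relation \phi = 0 is preserved,
% and \dot q_1 - \dot q_2 = p_1 is constant, in agreement with the Lagrangian
% equation d(\dot q_1 - \dot q_2)/dt = 0. The variable v is left undetermined.
```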





5. Homogeneous velocities. The theory takes a specially simple form in
the case when the Lagrangian is homogeneous of the first degree in the
velocities. The momenta defined by (2) are then homogeneous of degree zero in the
$\dot q$'s and so depend only on the ratios of the $\dot q$'s. Since there are $N$ $p$'s and only
$N - 1$ independent ratios of the $\dot q$'s, the $p$'s now cannot be independent
functions of the $\dot q$'s and there must be at least one relation (4) connecting the $q$'s
and $p$'s. The case when there is only one relation between the $q$'s and $p$'s may
now be considered as the standard case.

From Euler’s theorem we have


(18) $L \equiv \dot q_n\,\partial L/\partial\dot q_n$
and hence $L = \dot q_n p_n$

so that

(19) $H = 0$.


This weak equation holding in the region $R$ allows us to take $\mathcal{H} = 0$, so that
(9) becomes


(20) $H \equiv v_m\phi_m$.
The general equation of motion (16) is now
(21) $\dot g = v_m[g, \phi_m]$.


The Hamiltonian equations of motion are now fixed entirely by the equations
$\phi_m = 0$.
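
A familiar illustration is the free relativistic particle. The sketch below is an added example under stated assumptions (units with $c = 1$, metric signature $(+,-,-,-)$, rest mass $m$, and coordinates $x^\mu$ in place of the $q_n$); none of these symbols appears in the original text.

```latex
% Hypothetical example: free relativistic particle, assumed conventions as stated.
\begin{align*}
L &\equiv -m\sqrt{\dot x^\mu\dot x_\mu} , \\
p_\mu &= \partial L/\partial\dot x^\mu = -m\,\dot x_\mu\big/\sqrt{\dot x^\nu\dot x_\nu} , \\
\phi &\equiv p^\mu p_\mu - m^2 = 0 , \\
H &\equiv \dot x^\mu p_\mu - L = 0 , \qquad H \equiv v\,\phi , \\
\dot x^\mu &= v\,[x^\mu, \phi] = 2v\,p^\mu , \qquad \dot p_\mu = v\,[p_\mu, \phi] = 0 .
\end{align*}
% Comparing \dot x^\mu = 2 v p^\mu with the definition of p_\mu gives
% v = -\sqrt{\dot x^\nu\dot x_\nu}/(2m), which is fixed only once the
% independent variable has been chosen.
```

The Lagrangian is homogeneous of the first degree in the $\dot x^\mu$, there is exactly one relation between coordinates and momenta, and the equations of motion are fixed entirely by $\phi = 0$.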

Equation (21) is homogeneous in the $v$'s on the right-hand side. Given any
solution of the equations of motion, one can obtain another solution from it
by multiplying all the $v$'s by a factor $\gamma$, which may vary arbitrarily with the
time. The new solution will have the time rate of change of all dynamical
variables multiplied by the factor $\gamma$. The new solution would be obtained
from the previous solution if we replaced the time $t$ by a new independent
variable $\tau$ such that $dt/d\tau = \gamma$. The new independent variable is completely
arbitrary: it can be any function of $t$ and the $q$'s and $\dot q$'s. Thus, given any
solution of the equations of motion, we can get another solution from it by
replacing $t$ by an arbitrary $\tau$, so the equations of motion give us no information
about the independent variable. This is an important feature of dynamical
theory with homogeneous velocities, and makes it specially convenient for a
relativistic treatment.

The Lagrangian for any dynamical system can be made to satisfy the condition
for homogeneous velocities by taking the time $t$ to be an extra coordinate $q_0$
and using the equation $\dot q_0 = 1$ to make the Lagrangian homogeneous of the first
degree in all the velocities, including $\dot q_0$. The new Lagrangian equations of
motion for all the $q$'s can then be deduced, as has been shown by the author [1].
In this way we can get a new formulation for a general dynamical system in
terms of homogeneous velocities. The new formulation gives all the equations
of the old formulation except the equation $\dot q_0 = 1$. If we want to have this
equation in the new formulation we may assume it as a supplementary
condition, not derivable from the equations of motion but consistent with them. We
can, however, very well dispense with it, as its only effect is to fix the
independent variable, which would otherwise be arbitrary in the homogeneous
velocity formulation.

Thus we may confine ourselves to the homogeneous velocity theory without
losing any generality. We shall do this in future as it leads to somewhat
simpler equations, and use the dot to denote differentiation with respect to an
arbitrary independent variable $\tau$.


6. The consistency conditions. For consistency the equations of motion
must make each of the $\phi_m$ remain zero. Thus, putting $\phi_{m'}$ for $g$ in (21),
we get
(22) $v_m[\phi_m, \phi_{m'}] = 0$.
Let us suppose the equations (22) to be reduced as far as possible with the
help of the set of equations (4). The reduction may involve the cancellation
of factors when we can assume these factors do not vanish. The resulting
equations must each be of one of four types.
Type 1. It involves some of the variables $v$.
Type 2. It is independent of the $v$'s but involves some of the variables $p$ and
$q$. It is thus of the form
(23) $\chi(q, p) = 0$
and is independent of the equations (4).
Type 3. It reduces to $0 = 0$.
Type 4. It reduces to $1 = 0$.

An equation of type 2 leads to a further consistency condition, since we
must have $\chi$ remaining zero. Putting $\chi$ for $g$ in (21), we get
(24) $v_m[\phi_m, \chi] = 0$.
This equation, reduced as far as possible with the help of equations (4) and
any equations (23) that we already have, will again be of one of the four types.
If it is of type 2 it will lead to yet another consistency condition. We continue
in this way with each equation of type 2 until it leads to an equation of another
type.

If any of the equations obtained in this way is of type 4, the equations of
motion are inconsistent. This case is of no interest and will be excluded in
future. Equations of type 3 are automatically satisfied. We are left with the
equations of types 1 and 2 to fit into the theory.

Let us call the complete set of equations of type 2


(25) $\chi_k(q, p) = 0$, $k = 1, 2, \ldots, K$.


We may suppose the functions $\chi_k$ to be chosen, like the $\phi_m$ in (4), so that their
variations are of order $\epsilon$. Equations (25) are then correctly written as weak
equations. These further weak equations will reduce the region $R$, in which
all weak equations hold, so as to have only $2N - K$ dimensions. The region
$R_\epsilon$ will also get reduced, as it will now consist of all points within a distance of
order $\epsilon$ from the new region $R$.

For studying the equations of type 1, it is convenient to introduce some new
concepts. We define one of the quantities $\phi_m$ to be a first class $\phi$ if its P. b.
with every $\phi$ and $\chi$ vanishes. Thus $\phi_m$ is first class if

(26) $[\phi_m, \phi_{m'}] = 0$, $m' = 1, 2, \ldots, M$,
$[\phi_m, \chi_k] = 0$, $k = 1, 2, \ldots, K$.
These equations need hold only in the weak sense, which means that they need
hold only as consequences of the equations $\phi_m = 0$, $\chi_k = 0$. Thus the
left-hand sides of (26) must each equal in the strong sense some linear function of
the $\phi_m$ and $\chi_k$. A $\phi$ which does not satisfy all these conditions we call a second
class $\phi$.

We can make a linear transformation of the $\phi$'s of the form
(27) $\phi^*_m = \gamma_{mm'}\phi_{m'}$,
where the $\gamma$'s are any functions of the $q$'s and $p$'s such that their determinant
does not vanish in the weak sense. The $\phi^*$'s are then equivalent to the $\phi$'s
for all the purposes of the theory.

Let us make a transformation of this kind so as to bring as many $\phi$'s as
possible into the first class. Let us call the first class $\phi$'s that we then have
$\phi_\alpha$'s and the second class ones $\phi_\beta$'s, with $\beta = 1, 2, \ldots, B$ and $\alpha = B + 1,
B + 2, \ldots, M$.

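If $\phi_{m'}$ is first class, equation (22) is automatically satisfied. Further, in
equations (22) and (24) we can restrict $\phi_m$ to be second class, as first class $\phi_m$'s
contribute zero. Thus the surviving equations (22) and (24) will read


(28) $v_\beta[\phi_\beta, \phi_{\beta'}] = 0$, $\beta, \beta' = 1, 2, \ldots, B$,
$v_\beta[\phi_\beta, \chi_k] = 0$, $k = 1, 2, \ldots, K$.


These are all the equations of type 1. They show that either all the $v_\beta$'s vanish
or the matrix

(29) $\begin{pmatrix}
0 & [\phi_1, \phi_2] & [\phi_1, \phi_3] & \cdots & [\phi_1, \phi_B] & [\phi_1, \chi_1] & \cdots & [\phi_1, \chi_K] \\
[\phi_2, \phi_1] & 0 & [\phi_2, \phi_3] & \cdots & [\phi_2, \phi_B] & [\phi_2, \chi_1] & \cdots & [\phi_2, \chi_K] \\
\vdots & \vdots & \vdots & & \vdots & \vdots & & \vdots \\
[\phi_B, \phi_1] & [\phi_B, \phi_2] & [\phi_B, \phi_3] & \cdots & 0 & [\phi_B, \chi_1] & \cdots & [\phi_B, \chi_K]
\end{pmatrix}$

is of rank less than $B$, in the weak sense.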
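It will now be proved that the first alternative is the correct one. Assume
the matrix (29) is of rank $U < B$. Form the determinant

(30) $D = \begin{vmatrix}
\phi_1 & 0 & [\phi_1, \phi_2] & [\phi_1, \phi_3] & \cdots & [\phi_1, \phi_U] \\
\phi_2 & [\phi_2, \phi_1] & 0 & [\phi_2, \phi_3] & \cdots & [\phi_2, \phi_U] \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
\phi_{U+1} & [\phi_{U+1}, \phi_1] & [\phi_{U+1}, \phi_2] & [\phi_{U+1}, \phi_3] & \cdots & [\phi_{U+1}, \phi_U]
\end{vmatrix}$.

$D$ is a linear function of the $\phi_\beta$'s and so vanishes in the weak sense. The P. b.
of $D$ with any quantity $f$ equals the sum of the determinants formed by taking
the P. b. of each column of (30) with $f$. All these determinants, except the
one formed by taking the P. b. of the first column with $f$, will vanish in the
weak sense, as the elements of their first column all vanish in the weak sense.
Thus

(31) $[D, f] = \begin{vmatrix}
[\phi_1, f] & 0 & [\phi_1, \phi_2] & [\phi_1, \phi_3] & \cdots & [\phi_1, \phi_U] \\
\vdots & \vdots & \vdots & \vdots & & \vdots \\
[\phi_{U+1}, f] & [\phi_{U+1}, \phi_1] & [\phi_{U+1}, \phi_2] & [\phi_{U+1}, \phi_3] & \cdots & [\phi_{U+1}, \phi_U]
\end{vmatrix}$.

If we take $f$ to be any $\phi_\alpha$, the first column of (31) vanishes and so $[D, \phi_\alpha] = 0$.
If we take $f$ to be any $\phi_\beta$ or $\chi_k$, the determinant (31) either has two columns
identical and so vanishes, or it is a minor of the matrix (29) with $U + 1$ rows
and columns, and vanishes because this matrix is assumed to be of rank $U$.
Thus $D$ has zero P. b. with all the $\phi$'s and $\chi$'s.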

It may be that $D$ vanishes in the strong sense on account of the co-factors
of the elements of its first column all vanishing in the weak sense. If this is
the case, we take a different determinant $D$, with its columns after the first
one corresponding to any $U$ of the columns of (29) and its rows corresponding
to any $U + 1$ of the rows of (29). We can always choose such a determinant $D$
so that the co-factors of the elements of its first column do not all vanish,
from the assumption that (29) is of rank $U$. We get in this way a $D$ which is
a first class $\phi$ and is a linear function of the $\phi_\beta$'s. This contradicts the
assumption that we had previously put as many $\phi$'s as possible in the first class.

We can conclude that if we have put as many $\phi$'s as possible in the first class,
the $v$'s associated with the second class $\phi$'s all vanish. The Hamiltonian (20)
then reduces to


(32) $H \equiv v_\alpha\phi_\alpha$,
and the general equation of motion (21) becomes


(33) $\dot g = v_\alpha[g, \phi_\alpha]$.

The vanishing of the $v_\beta$'s together with the equations (25) ensure that the
consistency conditions are all satisfied. The $v_\alpha$'s remain completely arbitrary.
Each of them gives rise to a freedom of motion, that is, an arbitrary function in the
general solution of the equations of motion. In the standard case there is
just one $\phi$, which is necessarily first class, and thus there is one arbitrary
function in the general solution of the equations of motion. This is connected
with the arbitrary character of the independent variable $\tau$.
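
A small example may help to fix the classification. The three relations below are assumed directly rather than derived from a Lagrangian; they are hypothetical and chosen only for illustration, with $N = 2$ and no $\chi$ equations.

```latex
% Hypothetical constraints, assumed for illustration only.
\begin{align*}
\phi_1 &\equiv p_1 , \qquad \phi_2 \equiv q_1 , \qquad \phi_3 \equiv p_2 , \\
[\phi_1, \phi_2] &= -1 , \qquad [\phi_1, \phi_3] = 0 , \qquad [\phi_2, \phi_3] = 0 , \\
% consistency conditions (22) for m' = 1, 2, 3 respectively
v_2[\phi_2, \phi_1] &= v_2 = 0 , \qquad v_1[\phi_1, \phi_2] = -v_1 = 0 , \qquad 0 = 0 .
\end{align*}
% \phi_3 is first class; \phi_1 and \phi_2 are second class, so B = 2. The matrix
% (29) has rows (0, -1) and (1, 0), of rank 2 = B, and therefore v_1 = v_2 = 0.
\begin{align*}
H &\equiv v_3\,p_2 , \qquad \dot g = v_3\,[g, p_2] , \qquad \dot q_2 = v_3 .
\end{align*}
```

The first two conditions are of type 1 and the third is of type 3. The single surviving $v_3$ is arbitrary and gives one arbitrary function in the general solution.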


7. Supplementary conditions. In dealing with a particular dynamical
system, we may wish to impose equations on the coordinates and velocities
additional to the equations of motion that follow from the Lagrangian. Such
supplementary conditions must be introduced as further weak equations in
the theory.

With the help of equations (10) (with $\mathcal{H} = 0$) the supplementary conditions
can be expressed as relations between the $q$'s, $p$'s and $v$'s. They may lead to
equations between the $q$'s and $p$'s only. Such equations must be treated as
extra $\chi$ equations, to be joined on to the set (25). They will give rise to further
consistency conditions, which are to be handled in the same way as the
preceding ones and may lead to still more $\chi$ equations. A first class $\phi$ must now
be defined to have zero P. b. also with these new $\chi$'s, so the number of first
class $\phi$'s may be reduced by the supplementary conditions. This would cause
a reduction in the number of freedoms of motion.

Those of the supplementary conditions that do not give $\chi$ equations will
give conditions on the $v_\alpha$ variables. These conditions will usually be of a more
complicated kind than the mere vanishing of certain $v$'s, which is the form taken
by all the conditions on the $v$'s that follow from consistency conditions. They
will make a further reduction in the number of freedoms of motion, reducing it
to less than the number of first class $\phi$'s.


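8. Transformations of the Hamiltonian form. Take a set of functions
$\theta_s$ ($s = 1, 2, \ldots, S$) of the $q$'s and $p$'s such that the determinant

(34) $\Delta = \begin{vmatrix}
0 & [\theta_1, \theta_2] & [\theta_1, \theta_3] & \cdots & [\theta_1, \theta_S] \\
[\theta_2, \theta_1] & 0 & [\theta_2, \theta_3] & \cdots & [\theta_2, \theta_S] \\
\vdots & \vdots & \vdots & & \vdots \\
[\theta_S, \theta_1] & [\theta_S, \theta_2] & [\theta_S, \theta_3] & \cdots & 0
\end{vmatrix}$

does not vanish in the weak sense. This implies that $S$ must be even. Let
$c_{ss'}$ denote the co-factor of $[\theta_s, \theta_{s'}]$ divided by $\Delta$, so that


$c_{ss'} = -c_{s's}$
and


(35) $c_{ss'}[\theta_s, \theta_{s''}] = \delta_{s's''}$.


Then we can define a new P. b. $[\xi, \eta]^*$ for any two quantities $\xi$ and $\eta$ by the
formula


(36) $[\xi, \eta]^* = [\xi, \eta] + [\xi, \theta_s]\,c_{ss'}\,[\theta_{s'}, \eta]$.

It is easily seen that the new P. b.'s obey the first two of the laws (15), and
after some calculation one finds that they also obey the third, Poisson's
identity. (See Appendix.) The new P. b.'s make


$[\xi, \theta_s]^* = [\xi, \theta_s] + [\xi, \theta_{s'}]\,c_{s's''}\,[\theta_{s''}, \theta_s]$
$= [\xi, \theta_s] - [\xi, \theta_{s'}]\,\delta_{s's}$
(37) $= 0$
for any $\xi$.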

To understand the significance of the new P. b.'s, let us take the case when
the $\theta$'s consist of $\tfrac12 S$ coordinates $q$ and their conjugate $p$'s. We then see that
the new P. b.'s are obtained by omitting the terms involving differentiations
with respect to these $q$'s and $p$'s from the summation over $n$ in the definition
(14). Thus the new P. b.'s refer to a system of $N - \tfrac12 S$ degrees of freedom.
If, instead of taking the $\theta$'s to be just certain $q$'s and $p$'s, we take them to be
any independent functions of these $q$'s and $p$'s, we get the same new P. b.'s.
With general $\theta$'s the new P. b.'s will still refer to a system of $N - \tfrac12 S$ degrees
of freedom, but the reduction of the degrees of freedom is made in a more
complicated way than the mere omission of certain $q$'s and $p$'s.
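
The simplest case can be worked out explicitly. The sketch below is an added illustration, assuming $S = 2$ with $\theta_1 = q_1$ and $\theta_2 = p_1$; it uses only the definitions (14), (34) and (36).

```latex
% Illustrative choice (an assumption): S = 2, \theta_1 = q_1, \theta_2 = p_1.
\begin{align*}
[\theta_1, \theta_2] &= 1 , \qquad
\Delta = \begin{vmatrix} 0 & 1 \\ -1 & 0 \end{vmatrix} = 1 , \qquad
c_{12} = 1 , \quad c_{21} = -1 , \\
[\xi, \eta]^* &= [\xi, \eta] + [\xi, q_1]\,[p_1, \eta] - [\xi, p_1]\,[q_1, \eta] \\
&= [\xi, \eta] - \Bigl(\frac{\partial\xi}{\partial q_1}\frac{\partial\eta}{\partial p_1}
   - \frac{\partial\xi}{\partial p_1}\frac{\partial\eta}{\partial q_1}\Bigr)
 = \sum_{n \ge 2}\Bigl(\frac{\partial\xi}{\partial q_n}\frac{\partial\eta}{\partial p_n}
   - \frac{\partial\xi}{\partial p_n}\frac{\partial\eta}{\partial q_n}\Bigr) .
\end{align*}
```

The terms with $n = 1$ drop out of the sum in (14), so the new P. b. refers to $N - 1$ degrees of freedom, and $[\xi, q_1]^* = [\xi, p_1]^* = 0$ for any $\xi$.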

Let us suppose the $\theta$'s are all $\phi$'s or $\chi$'s. (The $\phi$'s must be second class, as
otherwise $\Delta = 0$.) We then have $[\theta_s, H] = 0$ for all $s$, and hence
(38) $[g, H]^* = [g, H] = \dot g$
for $g$ any function of the $q$'s and $p$'s. Thus the new P. b.'s may be used to give
the Hamiltonian equations of motion. We get in this way a new form for the
equations of motion, which is simpler because the number of effective degrees
of freedom is reduced.

Each of the $\theta$'s now vanishes in the weak sense. If we work only with the
new P. b.'s, we can assume each of the $\theta$'s vanishes in the strong sense without
getting a contradiction, because from (37) the new P. b. of a $\theta$ with anything
vanishes. We can then use the strong equations $\theta_s \equiv 0$ to simplify the
Hamiltonian.

Let us define a $\chi$ to be first class if it has zero P. b. with all the $\phi$'s and $\chi$'s
and to be second class otherwise. We can make a linear transformation of the
$\chi$'s of the form


(39) $\chi^*_k = \gamma_{kk'}\chi_{k'} + \gamma'_{km}\phi_m$,

where the $\gamma$'s and $\gamma'$'s are any functions of the $q$'s and $p$'s such that the
determinant of the $\gamma$'s does not vanish in the weak sense, and the new $\chi$'s are then
equivalent to the old ones for all the purposes of the theory. Let us make a
transformation of this kind so as to bring as many $\chi$'s as possible into the first
class, and let us call the first class $\chi$'s that we then have $\chi_\alpha$'s and the second
class ones $\chi_\beta$'s.

We may take the $\theta$'s to consist of all the $\phi_\beta$'s and $\chi_\beta$'s. The determinant
$\Delta$ then does not vanish. The proof of this result is similar to the proof that
the matrix (29) is of rank $B$, and consists in assuming that $\Delta$ is of rank $T < S$
and constructing a determinant like

(40) $\begin{vmatrix}
\theta_1 & 0 & [\theta_1, \theta_2] & \cdots & [\theta_1, \theta_T] \\
\theta_2 & [\theta_2, \theta_1] & 0 & \cdots & [\theta_2, \theta_T] \\
\vdots & \vdots & \vdots & & \vdots \\
\theta_{T+1} & [\theta_{T+1}, \theta_1] & [\theta_{T+1}, \theta_2] & \cdots & [\theta_{T+1}, \theta_T]
\end{vmatrix}$,

which is then seen to be a first class $\phi$ or $\chi$ and is a linear function of the $\phi_\beta$'s
and $\chi_\beta$'s, so it contradicts the assumption that as many $\phi$'s and $\chi$'s as possible
have been put in the first class.

With this choice of $\theta$'s we get the maximum simplification of the Hamiltonian
equations of motion by this method. We get a new scheme in which all the
$\phi_\beta$ and $\chi_\beta$ equations are strong equations. We may be able to use these
equations to eliminate some of the $q$'s and $p$'s entirely from the theory.

The form of the new scheme is not unique, because the $\phi_\beta$'s and $\chi_\beta$'s are not
unique. If we merely replace the $\phi_\beta$'s and $\chi_\beta$'s by linear functions of
themselves, we do not change the final form. We can, however, add to the $\phi_\beta$'s
any linear functions of the $\phi_\alpha$'s, and to the $\chi_\beta$'s any linear functions of the
$\phi_\alpha$'s and $\chi_\alpha$'s, which does not change $\Delta$ or the $c_{ss'}$, but does in general change
$[\xi, \eta]^*$, and so the form of the Hamiltonian scheme is altered. The different
forms must, of course, be equivalent, as they all give the same equations of
motion.

As an application of the above method, let us consider the case of a
Lagrangian that does not involve some of the velocities. Suppose $L$ does not involve
$\dot q_j$ ($j = 1, 2, \ldots, J < N$). Then each $p_j$ equals zero in the weak sense and equals
a $\phi$ in the strong sense. Suppose that no linear combination of the $p_j$'s is first
class. Then we can take the $p_j$'s to be $\phi_\beta$'s. Let us now take half the $\theta$'s to
be the $p_j$'s and the other half to be suitable second class $\phi$'s or $\chi$'s so that $\Delta$
does not vanish. Call these other $\theta$'s $\theta_j$. With this choice of $\theta$'s one easily
sees that the new P. b.'s are just what one would get if one applied the
definition (14) to those degrees of freedom for which $q$ is not a $q_j$, with each $p_j$
reckoned as strongly equal to zero and each $q_j$ reckoned as strongly equal to
a function of the other $q$'s and $p$'s given by the equations $\theta_j = 0$. We get in
this way a new Hamiltonian scheme (not necessarily with the maximum
simplification, as there may be other $\phi_\beta$'s and $\chi_\beta$'s not included in the $\theta$'s)
in which the $q_j$ and $p_j$ do not appear as independent dynamical variables.

The new scheme could be obtained in a more direct way by not counting
the $q_j$'s as coordinates right from the beginning, and not introducing momenta
conjugate to them at all. Let us see what modifications this would bring into
the development of the theory.

Restrict $n$ so as to take on only those values for which $q$ is not a $q_j$, i.e. the
values $J+1, J+2, \ldots, N$. Then equations (2) and (5) still hold and equation
(6) must be replaced by
(41)    $\delta H = \dot q_n\,\delta p_n - \partial L/\partial q_n \cdot \delta q_n - \partial L/\partial q_j \cdot \delta q_j$,
as we allow the $q_j$'s to vary. We may assume the equations
(42)    $\partial L/\partial q_j = 0$
as supplementary conditions with this method. Equation (41) then reduces to
precisely (6). We can infer that $H$ is of the form (20), where the $\phi_m$ are
functions of the $q_n$'s and $p_n$'s, independent of the $q_j$'s, that vanish on account of
equations (2). The remainder of the theory can be developed as before, in
terms of $\phi$'s and $\chi$'s that do not involve the $q_j$'s. Those of the $\phi$ or $\chi$ equations
that do involve the $q_j$'s can be looked upon as defining the $q_j$'s in terms of
the other variables, and play no further role in the theory.

With this form of the theory we have the Lagrangian containing variables
$q_j$ that involve momenta. The appearance of momentum variables in the
Lagrangian is analogous to the appearance of the velocity variables $v_m$ in the
Hamiltonian.


9. The Hamiltonian as starting point. Instead of starting with the Lagrangian
and obtaining the Hamiltonian from it, one can start with the Hamiltonian.
One begins by assuming certain dynamical variables $q_n$ and $p_n$ $(n = 1,
2, \ldots, N)$, or maybe other dynamical variables between which there are definite
P. b. relations satisfying the laws (15), and assuming certain weak equations as
$\phi$ equations connecting them. There is no point in distinguishing $\phi$'s and $\chi$'s
with this method. At least one of the $\phi$'s must be first class, i.e. must have
zero P. b. with all the $\phi$'s, or there can be no consistent motion. One then
assumes the Hamiltonian to be a linear function of the first class $\phi$'s $\phi_a$, with
new variables $v_a$ as coefficients, and assumes the Hamiltonian equations of
motion (17) or (33). The $v$'s can vary arbitrarily with the independent
variable $\tau$.

The previous scheme of equations of motion, derived from a Lagrangian and
involving possibly $\chi$'s as well as $\phi$'s, is to be looked upon as an example of the
present scheme with some of the $v$'s restricted to be zero by supplementary
conditions. The $\phi_a$'s corresponding to these $v_a$'s are then the first class $\chi$'s of
the previous scheme. Such supplementary conditions, or any supplementary
conditions involving the $v$'s, are of no value for the application of the theory to
relativistic dynamics given in the next section and cannot be taken over into
the quantum theory, so they will not be included in the further work.
Supplementary conditions not involving the $v$'s are just $\phi$ equations.

The P. b. of two first class $\phi$'s is a first class $\phi$, as may be verified in the
following way. The P. b. $[\phi_a, \phi_{a'}]$ vanishes weakly and so is strongly equal to a
linear function of the $\phi$'s, these being the only quantities that are weakly zero
in the present scheme. We have to show that its P. b. with an arbitrary $\phi$ is
weakly zero. From Poisson's identity
(43)    $[\phi, [\phi_a, \phi_{a'}]] = [[\phi, \phi_a], \phi_{a'}] - [[\phi, \phi_{a'}], \phi_a]$.
Since $\phi_a$ is first class, $[\phi, \phi_a]$ vanishes weakly and so is strongly equal to a
linear function of the $\phi$'s, and hence its P. b. with the first class $\phi_{a'}$ vanishes
weakly. Similarly the second term on the right-hand side of (43) vanishes weakly,
so the required result is proved.

Suppose there are $A$ independent first class $\phi$'s and $M$ independent $\phi$'s
altogether. In the phase space (the $2N$-dimensional space of the $q_n$ and $p_n$
variables) there is a space of $(2N - M)$ dimensions in which all the $\phi$ equations
are satisfied. Call it the $(2N - M)$-space. The state of the dynamical system
for a particular $\tau$ value is fixed by giving values to the $q$'s and $p$'s satisfying
all the $\phi$ equations and is thus represented by a point $P$ in the $(2N - M)$-space.
The motion of the system ensuing from this state is represented by a curve in
the $(2N - M)$-space starting from $P$. On account of the $A$ variables $v_a$ being
arbitrary, this curve may start out in any direction in a small space of $A$
dimensions surrounding $P$. There is one of these small spaces of $A$ dimensions
surrounding every point of the $(2N - M)$-space. It will now be shown that
these small spaces are integrable.

Suppose that for an interval of $\tau$, $\delta\tau = \epsilon_1$, all the $v$'s vanish except $v_{a'}$,
which is equal to 1, and that for the following $\tau$ interval, $\delta\tau = \epsilon_2$, all the
$v$'s vanish except $v_{a''}$, which is equal to 1. Then any function $g$ of the $q$'s and
$p$'s is changed at the end of the first interval to

        $g + \epsilon_1 [g, \phi_{a'}]$.

It is changed at the end of the second interval, with the accuracy $\epsilon_1\epsilon_2$ but with
neglect of $\epsilon_1^2$ and $\epsilon_2^2$, to

(44)    $g + \epsilon_1 [g, \phi_{a'}] + \epsilon_2 [g + \epsilon_1 [g, \phi_{a'}], \phi_{a''}]$.

If the two kinds of motion are made in the reverse order, $g$ changes to

(45)    $g + \epsilon_2 [g, \phi_{a''}] + \epsilon_1 [g + \epsilon_2 [g, \phi_{a''}], \phi_{a'}]$.

The difference between (44) and (45) is, by Poisson's identity,

(46)    $\epsilon_1 \epsilon_2 [g, [\phi_{a'}, \phi_{a''}]]$.

It was shown above that $[\phi_{a'}, \phi_{a''}]$ is a first class $\phi$, so that (46) is a possible
change in $g$ arising from the equations of motion with a suitable choice of the
$v$'s, and thus corresponds to a motion in the small $A$-dimensional space round
the starting point. This is the condition for integrability.
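
The step from (44) and (45) to (46) may be written out explicitly. The following is only a sketch: the labels $\epsilon_1$, $\epsilon_2$ for the two small $\tau$ intervals and $\phi_{a'}$, $\phi_{a''}$ for the two first class $\phi$'s that are switched on are assumed here for illustration, and terms of order $\epsilon_1^2$ and $\epsilon_2^2$ are neglected throughout. The first-order terms are the same in both orders and cancel, so only the cross terms survive.

```latex
\begin{align*}
(44) - (45)
  &= \epsilon_1 \epsilon_2 \bigl\{ [[g, \phi_{a'}], \phi_{a''}]
       - [[g, \phi_{a''}], \phi_{a'}] \bigr\} \\
  &= \epsilon_1 \epsilon_2 \, [g, [\phi_{a'}, \phi_{a''}]] ,
\end{align*}
% the last line uses Poisson's identity in the form
% [[g, A], B] + [[A, B], g] + [[B, g], A] = 0
% with A = \phi_{a'} and B = \phi_{a''}.
```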

If there are supplementary conditions involving the v’s, this integrability 
may get spoiled. Thus the integrability does not necessarily hold for the 
equations of motion derived from a Lagrangian. 

The integration of the small spaces will provide a set of $A$-dimensional spaces
lying in the $(2N - M)$-space such that the motion always takes place entirely
in one of them. Call these spaces $A$-spaces. Every curve in an $A$-space represents
a possible solution of the equations of motion. Every point of the
$(2N - M)$-space lies in an $A$-space, which contains all the motions starting from
that point. It would be permissible to consider the $A$-space itself as the complete
solution of the equations of motion, rather than a general curve in it.

Given a particular $A$-space, we can fix a point of it by $A$ coordinates, each
of which is some function of the $q$'s and $p$'s. Call these coordinates $t_a$ $(a = 1,
2, \ldots, A)$. They will play the role of time variables. The $A$-space itself can
be described by giving all the $q$'s and $p$'s as functions of the $t_a$. If $g$ is any $q$ or
$p$, or a function of the $q$'s and $p$'s, we have
(47)    $\dot g = \dot t_a \, \partial g/\partial t_a$,
since the $\tau$ variation of $g$ may be looked upon as arising from the $\tau$ variation
of the $t_a$'s. Using the Hamiltonian equations of motion (33) for $\dot g$ and $\dot t_a$, we
get

        $v_a [g, \phi_a] = v_a [t_{a'}, \phi_a] \, \partial g/\partial t_{a'}$.
This equation holds for arbitrary $v_a$, so

(48)    $[g, \phi_a] = [t_{a'}, \phi_a] \, \partial g/\partial t_{a'}$.
Equations (48) may be looked upon as the general equations of motion that
fix an $A$-space. They are the closest equations in the theory with homogeneous
velocities to the usual Hamiltonian equations of motion. If $A = 1$, we may
take the one variable $t_1$ to be the time and (48) then reduces to precisely the
usual Hamiltonian equations of motion.
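
The reduction for $A = 1$ can be illustrated as follows. This is a sketch under assumptions that are not taken from the text: the time $t$ is treated as a coordinate with conjugate momentum $p_t$, the single first class $\phi$ is taken to be $\phi = p_t + \mathcal{H}(q, p, t)$ with $\mathcal{H}$ an ordinary Hamiltonian, and the P. b. is the usual one with $[q_n, p_{n'}] = \delta_{nn'}$.

```latex
\begin{align*}
[t, \phi] &= \frac{\partial \phi}{\partial p_t} = 1, &
[q_n, \phi] &= \frac{\partial \mathcal{H}}{\partial p_n}, &
[p_n, \phi] &= -\frac{\partial \mathcal{H}}{\partial q_n}.
\end{align*}
% Equation (48) with A = 1 reads [g, \phi] = [t, \phi] \, \partial g / \partial t,
% so that, taking g = q_n and g = p_n in turn,
\begin{align*}
\frac{\partial q_n}{\partial t} &= \frac{\partial \mathcal{H}}{\partial p_n}, &
\frac{\partial p_n}{\partial t} &= -\frac{\partial \mathcal{H}}{\partial q_n}.
\end{align*}
```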

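To pass from the Hamiltonian to the Lagrangian, we introduce the velocities
$\dot q_n$ by the equations

(49)    $\dot q_n = v_a \, \partial\phi_a/\partial p_n$,
and then define $L$ by
(50)    $L = p_n \dot q_n - H = p_n \dot q_n - v_a \phi_a$.

This gives $L$ as a function of the $q$'s, $\dot q$'s, $p$'s and $v$'s, linear in the $\dot q$'s and $v$'s.
Making independent variations $\delta q$, $\delta\dot q$, $\delta p$, $\delta v$, we get

        $\delta L = \dot q_n \delta p_n + p_n \delta\dot q_n - \phi_a \delta v_a - v_a (\partial\phi_a/\partial q_n \cdot \delta q_n + \partial\phi_a/\partial p_n \cdot \delta p_n)$
(51)        $= p_n \delta\dot q_n - v_a \, \partial\phi_a/\partial q_n \cdot \delta q_n$.
Thus $\delta L$ does not depend on the $\delta p_n$ and $\delta v_a$. This result is to be compared
with (6).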

If the equations (49) together with the $\phi$ equations give the $\dot q$'s as independent
functions of the $p$'s and $v$'s, so that they allow the $p$'s and $v$'s to be considered
as functions of the $q$'s and $\dot q$'s, then (51) shows that $L$ is strongly equal
to a function of the $q$'s and $\dot q$'s only. This function must be homogeneous of
the first degree in the $\dot q$'s. Differentiating it partially with respect to a $\dot q$ or $q$,
we find

(52)    $\partial L/\partial \dot q_n = p_n$,
        $\partial L/\partial q_n = -v_a \, \partial\phi_a/\partial q_n = \dot p_n$.
These are the usual Lagrangian equations.
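
A standard example may make the passage (49) to (52) concrete. The sketch below is not taken from the text and rests on stated assumptions: a single first class $\phi = p_\mu p^\mu - m^2$ (a free relativistic particle), a metric of signature $(+,-,-,-)$, a time-like velocity $\dot q^\mu$, and the choice $v > 0$. The opposite sign of $v$ would reverse the sign of $L$.

```latex
\begin{align*}
\dot q^\mu &= v \, \frac{\partial \phi}{\partial p_\mu} = 2 v \, p^\mu
  && \text{from (49)}, \\
\dot q_\mu \dot q^\mu &= 4 v^2 m^2
  && \text{using } \phi = 0, \\
v &= \frac{\sqrt{\dot q_\nu \dot q^\nu}}{2m}, \qquad
p^\mu = \frac{m \, \dot q^\mu}{\sqrt{\dot q_\nu \dot q^\nu}}, \\
L &= p_\mu \dot q^\mu - v \phi = m \sqrt{\dot q_\mu \dot q^\mu}
  && \text{from (50)}, \\
\frac{\partial L}{\partial \dot q^\mu}
  &= \frac{m \, \dot q_\mu}{\sqrt{\dot q_\nu \dot q^\nu}} = p_\mu , \qquad
\frac{\partial L}{\partial q^\mu} = 0 = \dot p_\mu
  && \text{as in (52)}.
\end{align*}
```

Here $L$ depends on the $\dot q$'s only and is homogeneous of the first degree in them, as the general argument requires.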

If the equations (49) together with the $\phi$ equations do not give the $\dot q$'s as
independent functions of the $p$'s and $v$'s, they lead to certain equations
between the $q$'s and $\dot q$'s only, say
(53)    $R_j(q, \dot q) = 0$,    $j = 1, 2, \ldots, J$.
The $R$'s are homogeneous in the $\dot q$'s and we arrange them to be of the first
degree. We now proceed by a method analogous to that of §3 with the role of
$p$'s and $\dot q$'s interchanged. We obtain a result analogous to (9),

(54)    $L = \mathcal{L} + \lambda_j R_j$,

where $\mathcal{L}$ is a function of the $q$'s and $\dot q$'s only, which must be homogeneous of
the first degree in the $\dot q$'s, and the coefficients $\lambda_j$ are functions of the $q$'s, $p$'s
and $v$'s.

We have again equations (52) if the $\lambda$'s are counted as independent variables
in the partial differentiations of $L$, and $L$ is then homogeneous of the first degree
in the $\dot q$'s. Thus we have a Lagrangian containing momentum variables of
the type considered at the end of the preceding section, with the previous $q_j$
corresponding to the present $\lambda_j$ and the supplementary conditions (42) giving
the equations (53).


10. Application to relativistic dynamics. In the ordinary non-relativistic
dynamics one works with the state of a dynamical system at a particular
instant of time, this state being specified by giving values to the $q$'s and $p$'s.
One has equations of motion which enable one, given the state at one instant,
to calculate the state at another instant. These equations of motion, written
in the Hamiltonian form with homogeneous velocities, need only one first class
$\phi$.

To get a dynamical theory which satisfies restricted relativity, we must set 
up a scheme of equations which applies equally to observers with all velocities. 
If we work with instants, we must include instants with respect to all observers. 
An instant is then any flat three-dimensional surface in space-time having a 
normal in a direction within the light-cone. A general instant needs four para- 
meters to describe it, three to fix the direction of the normal, or the velocity 
of the observer, and the fourth to distinguish different instants for the same 
observer. 

A relativistic dynamics that involves instants must enable one, given the
state at any of these instants, to calculate the state at any other. We must
have equations of motion showing how the dynamical variables vary as the
instant varies. We can allow the instant to vary arbitrarily, with a
translational motion in space-time as well as the direction of its normal varying,
and the equations of motion must always apply. Thus we need four first
class $\phi$'s to give rise to the four freedoms of motion of the instant.

The four parameters that describe the instant are to be treated as $q$'s,
subject to the equations of motion (17) or (33) along with the other $q$'s and
$p$'s. They are distinguished from the other $q$'s and $p$'s in that it is specially
convenient to take them as the $t$ variables of the equations of motion (48).
These equations then show directly how any $q$ or $p$ varies for a given variation
of the instant.

There are other forms of relativistic dynamics not involving instants, which
have been discussed by the author [2]. There is the point form, in which a
state is defined with reference to a point in space-time. This form also needs
four first class $\phi$'s, corresponding to the four freedoms of motion of the point.
Then there is the front form, which needs three first class $\phi$'s, corresponding
to the three freedoms of motion of a front. Finally, we may take a state to
be defined on a general three-dimensional space-like surface in space-time.
There must then be infinitely many first class $\phi$'s, corresponding to all the
deformations that may be made in such a surface. With each of these forms
the variables that describe the point, front, or general space-like surface are
to be treated as $q$'s, subject to the equations of motion (17) or (33), and are
specially convenient to be taken as the $t$ variables of equations (48).

The first class $\phi$'s discussed above are the fewest with which one can construct
a relativistic dynamics in the respective forms. There may be additional
ones. For example, an electrodynamics which allows gauge transformations
to be made after one has fixed the initial values of all the $q$'s and $p$'s must
contain extra freedoms of motion, which will need extra first class $\phi$'s to give
rise to them.


11. Quantization. In order to quantize a dynamical system which one has
worked out in the classical theory, one must set up a scheme of linear operators
corresponding to the classical dynamical variables $q$ and $p$, and to functions of
them. There are no operators corresponding to the classical variables $v_a$ or to
velocity variables in general, or to anything involving $\tau$. The operators all
operate on the vectors $\psi$ of a Hilbert space, whose representatives in any
representation are the wave functions which specify states in the quantum
theory. Real classical variables correspond to self-adjoint operators.

The linear operators must be analogous to their classical counterparts in
accordance with two general principles. Using the same letter to denote two
things that are counterparts, the principles are
(i) P. b. relations between the classical variables correspond to commutation
relations between the operators, according to the formula

        $[\xi, \eta]$ corresponds to $2\pi(\xi\eta - \eta\xi)/ih$.

(ii) Weak equations between the classical variables correspond to linear
conditions on the vectors $\psi$, according to the formula

        $X(q, p) = 0$ corresponds to $X\psi = 0$.
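
As a small illustration of principle (i), consider a canonical pair. The sketch assumes the usual classical relation $[q_n, p_{n'}] = \delta_{nn'}$, which is not restated in this section, and simply substitutes it into the correspondence formula.

```latex
\begin{align*}
[q_n, p_{n'}] = \delta_{nn'}
\quad\longrightarrow\quad
\frac{2\pi}{ih} \, (q_n p_{n'} - p_{n'} q_n) = \delta_{nn'},
\qquad\text{i.e.}\qquad
q_n p_{n'} - p_{n'} q_n = \frac{ih}{2\pi} \, \delta_{nn'} .
\end{align*}
```

This is the familiar commutation relation, with $h/2\pi$ appearing as the natural unit.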


The procedure of passing from the classical to the quantum theory is not 
mathematically well-defined, because whenever a classical quantity involves a 
product of two factors whose P. b. does not vanish, there is an ambiguity in 
the order in which the two factors should appear in the corresponding quan- 
tum expression. In practice with simple examples one finds no difficulty in 
deciding what the order should be. With complicated examples it may be im- 
possible to choose the order in each case so as to make all the quantum equa- 
tions consistent, and then one would not know how to quantize the theory. 
The present-day methods of quantization are all of the nature of practical 
rules, whose application depends on considerations of simplicity. 

There are certain general features of the passage to the quantum theory
that one must pay attention to, in order that the consistency of the quantum
equations shall not go wrong in an elementary way. We have in the classical
theory a number of $\phi$ equations (counting $\chi$ equations also as $\phi$ equations),
which are to be used in the quantum theory according to the principle (ii).
We can transform the classical $\phi$'s linearly by the transformation (27) and the
new $\phi$'s are just as good as the old ones. If we make the corresponding
transformation in the quantum theory, we must take care to put the coefficients $\gamma$
all to the left of the $\phi$'s. A general $\phi$ in the quantum theory is a linear function
of the given $\phi$'s with coefficients on the left.
From two quantum equations obtained from $\phi$ equations by principle (ii)

        $\phi_1 \psi = 0$,    $\phi_2 \psi = 0$,
we can infer
        $\phi_2 \phi_1 \psi = 0$,    $\phi_1 \phi_2 \psi = 0$,
and hence from principle (i),
        $[\phi_1, \phi_2] \psi = 0$.
This corresponds to the classical weak equation
        $[\phi_1, \phi_2] = 0$.

We can infer that all the $\phi$'s must be first class if the passage to the quantum theory
is possible.

Given a classical theory with second class $\phi$'s, one can get a quantum theory
from it by first applying a transformation of the type described in §8, which
converts all the $\phi_\beta$ equations into strong equations. The strong equations will
correspond in the quantum theory to equations between operators, which serve
to define some of them in terms of the others.

The quantum equations $\phi\psi = 0$, obtained by applying principle (ii) to the
first class $\phi$ equations of the classical theory, are the Schroedinger wave
equations. The usual classical dynamics with only one first class $\phi$ leads to only
one Schroedinger equation. In the general theory there is one Schroedinger
equation for each classical freedom of motion. The operators in these equations
all correspond to classical dynamical variables for one $\tau$ value. The operators
referring to a different $\tau$ value do not belong to the same algebraic scheme,
and there does not seem to be anything in the quantum theory analogous to
the $\tau$ dependence of the classical variables.

However, the dependence of the classical variables on the parameters $t_a$,
given by equations (48), does have a quantum analogue, provided the $t$'s are
chosen so as to have zero P. b.'s with one another, so that they can be given
numerical values simultaneously in the quantum theory. The specially
convenient $t$'s for the various forms of relativistic dynamics discussed in the
preceding section do satisfy this condition. We cannot immediately take over
equations (48) into the quantum theory because, as is easily verified, the
equations that we should get would not be invariant under a general linear
transformation of the $\phi$'s (27). We must first put equations (48) in a standard
form. By a transformation (27) we introduce a new set of $\phi$'s, $\phi_a$ say, in
one-one correspondence with the $t_a$'s, so that
(55)    $[t_a, \phi_{a'}] = \delta_{aa'}$.
With these $\phi$'s equations (48) reduce to
(56)    $[g, \phi_a] = \partial g/\partial t_a$.

These equations can be taken over into the quantum theory, and are then
Heisenberg's quantum equations of motion for the present generalized
dynamics.


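Appendix. Proof of Poisson's identity for the new P. b.'s defined by (36).
Use the suffixes $r, s, t, \ldots$ to distinguish different $\theta$'s. We have by the
definition

$[[\xi,\eta]^*, \zeta]^*$
    $= [[\xi,\eta] + [\xi,\theta_r] c_{rs} [\theta_s,\eta], \zeta] + [[\xi,\eta] + [\xi,\theta_r] c_{rs} [\theta_s,\eta], \theta_t] \, c_{tu} [\theta_u,\zeta]$
(57) $= [[\xi,\eta],\zeta] + [[\xi,\theta_r],\zeta] c_{rs} [\theta_s,\eta] + [\xi,\theta_r] [c_{rs},\zeta] [\theta_s,\eta] + [\xi,\theta_r] c_{rs} [[\theta_s,\eta],\zeta]$
    $+ [[\xi,\eta],\theta_t] c_{tu} [\theta_u,\zeta] + [[\xi,\theta_r],\theta_t] c_{rs} [\theta_s,\eta] c_{tu} [\theta_u,\zeta]$
    $+ [\xi,\theta_r] [c_{rs},\theta_t] [\theta_s,\eta] c_{tu} [\theta_u,\zeta] + [\xi,\theta_r] c_{rs} [[\theta_s,\eta],\theta_t] c_{tu} [\theta_u,\zeta]$.

Let the operator $\sum$ denote the application of the three cyclic permutations
of $\xi, \eta, \zeta$ and the summation of the three results. Then we have to prove that

        $\sum [[\xi,\eta]^*, \zeta]^* = 0$.
$\sum$ applied to the first term of (57) gives zero, from the ordinary Poisson's
identity. $\sum$ applied to the second, fourth and fifth terms gives
        $\sum c_{rs} [\theta_s,\eta] \{ [[\xi,\theta_r],\zeta] + [[\theta_r,\zeta],\xi] + [[\zeta,\xi],\theta_r] \} = 0$,
from the ordinary Poisson's identity again. $\sum$ applied to the sixth and eighth
terms of (57) gives, with a cyclic permutation of $r, u, s, t$ in the latter,
(58) $\sum c_{rs} c_{tu} [\theta_s,\eta] [\theta_u,\zeta] \{ [[\xi,\theta_r],\theta_t] + [[\theta_t,\xi],\theta_r] \} = -\sum c_{rs} c_{tu} [\theta_s,\eta] [\theta_u,\zeta] [[\theta_r,\theta_t],\xi]$.

From (35) we can infer
        $[c_{tu} [\theta_r,\theta_t], \xi] = 0$
or

(59)    $[c_{tu},\xi] [\theta_r,\theta_t] + c_{tu} [[\theta_r,\theta_t],\xi] = 0$.
Thus (58) reduces to

        $c_{rs} [\theta_r,\theta_t] \sum [\theta_s,\eta] [\theta_u,\zeta] [c_{tu},\xi] = \sum [\theta_s,\eta] [\theta_u,\zeta] [c_{su},\xi]$,

with a further use of (35). This cancels with $\sum$ applied to the third term of
(57). $\sum$ applied to the remaining term of (57), the seventh, gives

(60)    $[\xi,\theta_r] [\eta,\theta_s] [\zeta,\theta_u] \{ c_{tu} [c_{rs},\theta_t] + c_{tr} [c_{su},\theta_t] + c_{ts} [c_{ur},\theta_t] \}$.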


Using $\sum_{rsu}$ to denote the operation of applying the three cyclic permutations
of $r, s, u$ and simultaneously of $r', s', u'$ and adding the three results, we have
from the ordinary Poisson's identity
(61)    $\sum_{rsu} c_{r'r} c_{s's} c_{u'u} [[\theta_{r'},\theta_{s'}],\theta_{u'}] = 0$.
From (59) with $\xi$ replaced by $\theta_{u'}$,

        $[c_{r'r},\theta_{u'}] [\theta_{r'},\theta_{s'}] + c_{r'r} [[\theta_{r'},\theta_{s'}],\theta_{u'}] = 0$,
so (61) gives
        $\sum_{rsu} c_{s's} c_{u'u} [\theta_{r'},\theta_{s'}] [c_{r'r},\theta_{u'}] = 0$.

With the help of (35), this reduces to

        $\sum_{rsu} c_{u'u} [c_{rs},\theta_{u'}] = 0$,

which shows that (60) vanishes. This completes the proof. All the above
equations may be written as strong equations, as no weak equations are used
in the proof.


REFERENCES

[1] P. A. M. Dirac, Homogeneous variables in classical dynamics, Proc. Camb. Phil. Soc.,
vol. 29 (1933), 389.
[2] P. A. M. Dirac, Forms of relativistic dynamics, Rev. Mod. Phys., vol. 21 (1949), 392.





St. John's College, Cambridge


PERIODIC LINEAR TRANSFORMATIONS OF AFFINE 
AND PROJECTIVE GEOMETRIES 


ERNST SNAPPER 


Introduction. In a paper called "A Theorem in Finite Projective Geometry
and some Applications to Number Theory" [Trans. Amer. Math. Soc., vol. 43
(1938), 377-385], J. Singer proved that the finite projective geometry
$PG(s-1, p^n)$, that is the projective geometry of dimension $s-1$ whose coordinate
field is the Galois field $GF(p^n)$, admits a collineation $L$ of period
$q = (p^{sn}-1)/(p^n-1)$. Since this $q$ is the number of points of $PG(s-1, p^n)$, Singer's
result states that the points of $PG(s-1, p^n)$ are cyclically arranged. Singer's
construction of $L$ uses the notion of a "primitive irreducible polynomial of
degree $s$ belonging to a field $GF(p^n)$ which defines a $PG(s-1, p^n)$." This
construction was presented at a seminar in foundations of geometry which
Professor H. S. M. Coxeter conducted at the University of Southern California
in the summer of 1948, and the author observed that the same $L$ could be
obtained by a more fundamental and less complicated algebraic construction,
notably without the use of the difficult notion of the above primitive
irreducible polynomial. Professor Coxeter judged that this construction, which
applies equally well to periodic linear transformations of affine geometries
and of infinite geometries, is of geometric interest and requested that it be
written up for publication.


1. Arbitrary affine and projective geometries. We first discuss the case
of the arbitrary affine geometry $EG(s,K)$, that is the affine geometry whose
dimension is $s$ and whose coordinate field is $K$. ($K$ may be infinite.) The
elements of $EG(s,K)$, which will be denoted by lower case Latin letters, are
the $(s,K)$-vectors, that is the row vectors of length $s$ whose components belong
to $K$. A linear transformation $L$ of $EG(s,K)$ is a transformation of $EG(s,K)$
into itself which satisfies the properties:

(1) $(v+w)L = (v)L + (w)L$, for $v, w \in EG(s,K)$.
(2) $(av)L = a((v)L)$, for $a \in K$ and $v \in EG(s,K)$.

The powers $L^h$ of $L$ for non-negative integers $h$ are defined in the usual way,
and $L^0$ is the identity transformation. A linear transformation $L$ is periodic
of period $q$ if $q$ is the smallest positive integer such that $L^q = L^0$. In order to
construct periodic linear transformations of $EG(s,K)$, we consider any algebra
$A$, not necessarily commutative, of finite rank $s$ with respect to $K$. (For
example, we may choose for $A$ an extension field of $K$ of finite field degree
$s = [A:K]$ over $K$.) We denote the elements of $A$ by lower case Greek letters
and choose a fixed $K$-basis in $A$. This gives rise, in the usual way, to a (1-1)-
correspondence between the elements of $A$ and the vectors of $EG(s,K)$. Since
under this (1-1)-correspondence all notions of linearity remain invariant, a
$K$-linear transformation $T$ of $A$ into itself (i.e. $(\alpha + \beta)T = (\alpha)T + (\beta)T$ and
$(\kappa\alpha)T = \kappa((\alpha)T)$ for $\alpha, \beta \in A$ and $\kappa \in K$) corresponds to a linear
transformation $L$ of $EG(s,K)$. In particular, if $\lambda$ is a fixed element of $A$, the $K$-linear
transformation $T$ of $A$ which is defined by $(\alpha)T = \alpha\lambda$ for any $\alpha \in A$
corresponds to a linear transformation $L$ of $EG(s,K)$. (Clearly, all we are doing is
considering $EG(s,K)$ as the representation space of the regular representation
of $A$.) In this case, the power $T^h$ of $T$ is defined by $(\alpha)T^h = \alpha\lambda^h$; hence $T$
(and consequently $L$) is periodic of period $q$ if and only if $q$ is the smallest
positive integer such that $\lambda^q$ is a right unit of $A$. Since, in case $A$ is an extension
field of $K$, this is equivalent to saying that $\lambda$ is a primitive $q$th root of unity,
we have proved the following statement.

Received September 17, 1948.


STATEMENT 1.1. Let $K$ be a field and let $A$ be an extension field of $K$ of field
degree $s = [A:K]$. Then, if $A$ contains a primitive $q$th root of unity $\lambda$, the affine
geometry $EG(s,K)$ admits a periodic linear transformation of period $q$.


In the same way, we can construct periodic linear transformations of the
arbitrary projective geometry $PG(s-1,K)$, that is the projective geometry of
dimension $s-1$ with coordinate field $K$. Here, the elements of $PG(s-1,K)$
are the classes of non-zero, $K$-proportional $(s,K)$-vectors. Let $A$ again be
any algebra of finite rank $s$ with respect to $K$ for which a fixed $K$-basis has been
chosen. We then obtain a (1-1)-correspondence between the elements of
$PG(s-1,K)$ and the classes of non-zero, $K$-proportional elements of $A$ which
leaves the notions of linearity invariant. In particular, the right multiplication
$T$ of the elements of $A$ by a fixed element $\lambda$ of $A$ corresponds to a linear
transformation $L$ of $PG(s-1,K)$, and $T^h$ corresponds to the right
multiplication of the elements of $A$ by $\lambda^h$. However, $T^q$ is the identity transformation
on the above classes of elements of $A$, if and only if $(\alpha)T^q = \alpha\lambda^q = \alpha\rho(\alpha)$,
where $\rho(\alpha)$ is an element of $K$ depending on $\alpha$. If $A$ contains $K$ and has a
left unit $\epsilon$, as is the case when $A$ is an extension field of $K$, we obtain for $\alpha = \epsilon$
that $\lambda^q \in K$. Conversely, if $\lambda^q \in K$, then certainly $(\alpha)T^q = \alpha\rho(\alpha)$ where
$\rho(\alpha) \in K$, since we can then choose $\rho(\alpha) = \lambda^q$ for all $\alpha \in A$. Hence we have
proved the following statement.


STATEMENT 1.2. Let $K$ be a field and let $A$ be an extension field of $K$ of field
degree $s = [A:K]$. Then, if there exists an element $\lambda \in A$ and a positive integer
$q$ such that $\lambda^q$ is the smallest power of $\lambda$ which lies in $K$, then $PG(s-1,K)$
admits a periodic linear transformation of period $q$.


2. Finite affine and projective geometries. We now study the finite
geometries $EG(s,p^n)$ and $PG(s-1,p^n)$, where the notation is the same as above,
except that we write $p^n$ to indicate that the coordinate field is now the Galois
field $GF(p^n)$. We choose for the algebra $A$ of the previous section the Galois
field $GF(p^{sn})$, which is an extension field of $GF(p^n)$ of degree $s$. (See for
instance van der Waerden, Moderne Algebra, vol. 1, sec. 37.) The non-zero
elements of $GF(p^{sn})$ form a cyclic group $\mathfrak{M}$ of order $p^{sn}-1$ and the non-zero
elements of $GF(p^n)$ form a cyclic subgroup $\mathfrak{N}$ of $\mathfrak{M}$ of order $p^n-1$. We choose
for the $\lambda$ of the previous section any generator of $\mathfrak{M}$. We then know from the
theory of cyclic groups that:

(2.1) The smallest positive integer $q$ such that $\lambda^q$ is the unit element of $\mathfrak{M}$ is
$p^{sn}-1$. Hence $\lambda$ is a primitive $q$th root of unity, where $q = p^{sn}-1$.

(2.2) The smallest positive integer $q$ such that $\lambda^q \in \mathfrak{N}$ is $(p^{sn}-1)/(p^n-1)$.
Hence $\lambda^q$ is the smallest power of $\lambda$ which lies in $GF(p^n)$, where
$q = (p^{sn}-1)/(p^n-1)$.


Statements 1.1 and 2.1 prove Theorem 1 and statements 1.2 and 2.2 prove
Theorem 2, following. Theorem 2 is Singer's theorem. The case $s = 2$ of
Theorem 1 was proved by R. C. Bose, "An Affine Analogue of Singer's
Theorem," [J. Indian Math. Soc. (N.S.), vol. 6 (1942), 5].


THEOREM 1. $EG(s, p^n)$ admits a periodic linear transformation of period
$p^{sn} - 1$.


THEOREM 2. $PG(s-1, p^n)$ admits a periodic linear transformation of
period $(p^{sn} - 1)/(p^n - 1)$.
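
The two periods can be checked numerically in small cases. The following Python sketch is an illustration only, under assumptions that are not part of the paper: it treats the case $n = 1$ (coordinate field $GF(p)$ with $p$ prime), realizes $GF(p^s)$ as polynomials in $x$ modulo a monic polynomial found by brute force, and takes $\lambda = x$. The brute-force search is a computational convenience and plays no role in the argument above; all function names are hypothetical.

```python
from itertools import product


def mul_by_x(vec, f, p):
    """Multiply vec by x in GF(p)[x]/(x^s + f[s-1] x^(s-1) + ... + f[0]).

    vec and f are coefficient tuples, lowest degree first."""
    carry = vec[-1]                       # coefficient pushed up to x^s
    shifted = (0,) + tuple(vec[:-1])      # plain multiplication by x
    # reduce with x^s = -(f[s-1] x^(s-1) + ... + f[0])
    return tuple((c - carry * fi) % p for c, fi in zip(shifted, f))


def powers_of_x(f, p):
    """Return [1, x, x^2, ...] up to the first return to 1, or None if the
    powers of x never return to 1 (x is then not a unit modulo f)."""
    one = (1,) + (0,) * (len(f) - 1)
    powers, seen = [one], {one}
    current = mul_by_x(one, f, p)
    while current != one:
        if current in seen:               # entered a cycle that avoids 1
            return None
        powers.append(current)
        seen.add(current)
        current = mul_by_x(current, f, p)
    return powers


def generator_powers(p, s):
    """Search for a modulus f such that x has order p^s - 1.  All non-zero
    residues are then powers of x, so the quotient is the field GF(p^s) and
    lambda = x generates its multiplicative group."""
    for f in product(range(p), repeat=s):
        powers = powers_of_x(f, p)
        if powers is not None and len(powers) == p**s - 1:
            return powers
    raise ValueError("no suitable modulus found")


def periods(p, s):
    """Return (affine period, projective period) of multiplication by lambda,
    acting on GF(p^s) viewed as EG(s, p) and as PG(s-1, p)."""
    powers = generator_powers(p, s)
    affine = len(powers)                  # smallest q with lambda^q = 1
    # smallest q >= 1 with lambda^q in GF(p), i.e. only a constant term
    projective = next(q for q in range(1, affine + 1)
                      if not any(powers[q % affine][1:]))
    return affine, projective


if __name__ == "__main__":
    for p, s in [(2, 3), (3, 2), (3, 3)]:
        print(p, s, periods(p, s))
```

For $p = 2$, $s = 3$ both periods are 7, the number of points of the plane $PG(2, 2)$; for $p = 3$, $s = 2$ the affine period is 8 and the projective period is 4.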


University of Southern California 











ELEMENTARFUNKTIONEN AUF RIEMANNSCHEN
FLÄCHEN ALS HILFSMITTEL FÜR DIE
FUNKTIONENTHEORIE MEHRERER VERÄNDERLICHEN


HEINRICH BEHNKE UND KARL STEIN 


In the present paper, the authors extend the Cousin theorems and the continuity theorem,
using some previous results on analytic functions connected with open Riemann surfaces.

The Cousin theorems, concerning the existence of analytic functions of several complex
variables with prescribed poles and zeros in a given domain, have been generalized in various
manners, but only in the case where the domain is schlicht. The authors proceed to the case
where the given domain $\mathfrak{G}$ is the direct product of $n$ open Riemann surfaces. They prove the
following two theorems.


(1) If in $\mathfrak{G}$ we prescribe locally meromorphic functions (i.e., if we cover $\mathfrak{G}$ with a
denumerable set of overlapping neighbourhoods $N_\mu$, $\mu = 1, 2, \ldots$, in each of which a meromorphic
function $f_\mu$ is defined such that in the intersections $N_\mu \cap N_\nu$, $f_\mu - f_\nu$ is regular), then it is
possible to determine a meromorphic function $f$, defined in $\mathfrak{G}$, such that in every $N_\mu$, $f - f_\mu$
is regular.

(2) Suppose the components of $\mathfrak{G}$ are simply connected with at most one exception. If in every
$N_\mu$ a regular function $F_\mu$ is given such that in $N_\mu \cap N_\nu$, $F_\mu/F_\nu$ is regular and non-vanishing,
then there exists a regular function $F$, defined in $\mathfrak{G}$, such that in every $N_\mu$, $F/F_\mu$ is regular and
non-vanishing.

The continuity theorem represents the basis of various investigations on singularities of
analytic functions of several complex variables. In its most general form it previously has
been proved only for regular functions. Now the authors generalize to the case of meromorphic
functions.


In ihrer Arbeit: ‘‘Entwicklungen analytischer Funktionen auf Riemann- 
schen Flachen’ haben die Verfasser fiir beliebige offene Riemannsche Flachen 
Elementarfunktionen und Elementarintegrale konstruiert, um mit ihrer Hilfe 
Aussagen iiber die Approximierbarkeit von Funktionen f(z) auf diesen Flachen 
durch spezielle Funktionen nachzuweisen. Es zeigt sich nun, dass mit diesen 
Hilfsmitteln und Aussagen auch grundsiatzliche Theoreme der Funktionen- 
theorie mehrerer Veranderlichen wesentlich erweitert werden kénnen. 

P. Cousin bewies in seiner Pariser These vom Jahre 1895,? dass in jedem 
schlichten Zylindergebiet im Raume von m komplexen Veranderlichen zu lokal 
vorgegebenen Polstellenflachen stets eine meromorphe Funktion mit diesen 
Polstellen konstruiert werden kann. Eine entsprechende Aussage wurde von 
ihm iiber die Existenz regularer Funktionen m. V. zu vorgegebenen Nullstellen 
aufgestellt. Doch hat T. Gronwall* einen Fehler in dem hierauf beziiglichen 
Beweise von Cousin aufgewiesen und gezeigt, dass die Giiltigkeit dieser 

Received October 25, 1948. 

1 Math. Ann., vol. 120 (1948). 

*Siehe W. F. Osgood, Lehrbuch der Funktionentheorie 11, 1,2. Auflage (1929), 248 ff. 

*T. H. Gronwall, “On the Expressibility of a Uniform Function of Several Complex Variables 
as the Quotient of Two Functions of Entire Character,"’ Trans. Amer. Math. Soc., vol. 18 (1917). 


152 


, 












17). 





— 





ELEMENTARFUNKTIONEN AUF RIEMANNSCHEN FLACHEN 





153 


second Cousin statement remains restricted to the case in which all components of the given schlicht (univalent) cylinder domain are simply connected, with at most one exception. Later, various authors dealt with the extension of Cousin's results. All these investigations have in common, however, that the Cousin problems are considered only in schlicht domains of $R_{2n}$. In what follows they are studied in non-schlicht cylinder domains over $R_{2n}$. Arbitrary non-closed Riemann surfaces with branch points in the interior are admitted as components of such a cylinder domain. It is proved that the first Cousin statement remains valid for these domains without restriction, and the second statement in the case that all component surfaces are simply connected, with at most one exception. In Cousin's formulation of the problem the hypothesis of schlichtness can therefore be dropped. For the special case $n = 1$ the extensions of Cousin's theorems given here read: (1) On every open Riemann surface one can find a meromorphic function that has a prescribed principal part at arbitrarily given points which do not accumulate in the interior, and that is regular everywhere else on the surface. (Transfer of the theorem of Mittag-Leffler.) (2) On every open Riemann surface one can find an everywhere regular function that vanishes to prescribed order at arbitrarily given points which do not accumulate in the interior, and that is different from zero everywhere else on the surface. (Transfer of the theorem of Weierstrass.)⁴

The continuity theorem (Kontinuitätssatz) forms the basis of the investigations of the singularities of analytic functions of several variables.⁵ In its simplest form it goes back to F. Hartogs and E. E. Levi and was later formulated more generally by other authors. In an important special case it can be stated as follows: "Let a sequence of parallel analytic planes $\mathfrak{E}_\nu$ in $R_{2n}$ be given which converge to a limit plane $\mathfrak{E}$. On the $\mathfrak{E}_\nu$ and on $\mathfrak{E}$ let there lie bounded domains $\mathfrak{B}_\nu$ and $\mathfrak{B}$ with boundaries $\mathfrak{C}_\nu$ and $\mathfrak{C}$ respectively, and let the $\mathfrak{B}_\nu$ converge to $\mathfrak{B}$. If a function $f(z_1,\dots,z_n)$ is regular (resp. meromorphic) and single-valued on all $\mathfrak{B}_\nu$ and on $\mathfrak{C}$, then it can be continued regularly (resp. meromorphically) from the boundary into the interior of $\mathfrak{B}$." H. Behnke has shown that the theorem remains true for regular functions if the planes $\mathfrak{E}_\nu$ and $\mathfrak{E}$ are replaced by arbitrary two-dimensional analytic surfaces $\mathfrak{F}_\nu$ and $\mathfrak{F}$.⁶ The proof rests on the regular convexity of domains of regularity and on the maximum principle for regular functions; it therefore cannot be carried over to the case of meromorphic functions. In what follows a further, very simple proof of this generalized continuity theorem is given, one that extends to meromorphic functions. The starting point is the line of proof of Hartogs, in which, however, the Cauchy integral formula for schlicht domains is replaced by a generalized Cauchy integral on Riemann surfaces, into which the elementary functions of these surfaces enter. The method of proof of H. Kneser⁷ for the continuity theorem for meromorphic functions can also be carried over in this way. It follows that the generalized continuity theorem in the version of H. Behnke also holds for meromorphic functions. The numerous statements that have so far been deduced from the generalized continuity theorem, but could be stated only for regular functions, therefore hold without restriction for meromorphic functions as well.⁸

⁴ See also Herta Florack, Reguläre und meromorphe Funktionen auf nichtgeschlossenen Riemannschen Flächen, Dissertation, Münster 1948.

⁵ See H. Behnke and P. Thullen, Theorie der Funktionen mehrerer komplexer Veränderlichen, Erg. d. Math. 3, 3 (1934), abbreviated B.-Th. Bericht.

⁶ H. Behnke, "Der Kontinuitätssatz und die Regulärkonvexität," Math. Ann., vol. 113 (1936), 392-397.


I. Functions and integrals on non-closed Riemann surfaces.
We first give a survey of the results that are needed from the cited paper of the authors.¹


Theorem A₁. Let $\mathfrak{R}$ be a non-closed Riemann surface. Then there exists in the 4-dimensional domain $\mathfrak{Z} = \mathfrak{R}_\zeta \times \mathfrak{R}_z$ a function $A(\zeta,z)$ with the following properties:⁹

(1) $A(\zeta,z)$ is meromorphic and single-valued in $\mathfrak{Z}$.

(2) If $\tau$, $t$ are local uniformizing parameters associated with the points $p(\zeta')$, $p(z')$ of $\mathfrak{R}_\zeta$ and $\mathfrak{R}_z$ respectively, then in a neighborhood of $p(\zeta',z') = p(\zeta') \times p(z')$:

\[
  A(\zeta,z)\,\frac{d\zeta}{d\tau} = \frac{1}{\tau - t} + R(\tau,t), \quad \text{if } p(\zeta') = p(z');
\]
\[
  A(\zeta,z)\,\frac{d\zeta}{d\tau} = S(\tau,t), \quad \text{if } p(\zeta') \neq p(z').
\]

Here $R(\tau,t)$ and $S(\tau,t)$ each denote functions that are regular in a neighborhood of $(\tau,t) = (0,0)$.
$A(\zeta,z)$ is called an elementary function of the 1st order on $\mathfrak{R}$. Near $p(\zeta) = p(z)$ it behaves essentially like $\dfrac{1}{\zeta - z}$. The Cauchy integral formula for schlicht domains can therefore be carried over as follows:

Theorem A₂. Let $\mathfrak{G}$ be a domain with piecewise smooth¹⁰ boundary $\mathfrak{C}$ lying entirely in the interior of the non-closed Riemann surface $\mathfrak{R}$. Let $A(\zeta,z)$ be an elementary function of the 1st order on $\mathfrak{R}$. Then for every function $f(z)$ that is single-valued and regular in $\mathfrak{G}$ and still continuous on $\mathfrak{C}$:

\[
  f(z) = \frac{1}{2\pi i}\int_{\mathfrak{C}} f(\zeta)\,A(\zeta,z)\,d\zeta .
\]

⁷ H. Kneser, "Der Satz von dem Fortbestehen der wesentlichen Singularitäten einer analytischen Funktion zweier Veränderlichen," Jber. Deutschen Math. Verein., vol. 41 (1932), 164 ff.; and "Ein Satz über die Meromorphiebereiche analytischer Funktionen von mehreren Veränderlichen," Math. Ann., vol. 106 (1932), 648 ff.

⁸ Some years ago Mr. W. Rothstein already presented a proof of the generalized continuity theorem for meromorphic functions that uses other means; owing to the special circumstances it could appear only now. See W. Rothstein, "Die invariante Fassung des Kontinuitätssatzes für meromorphe Funktionen," Archiv der Mathematik, vol. 1 (1948).

⁹ $\mathfrak{Z}$ is thus the direct product of $\mathfrak{R}$ with itself. The notation $\mathfrak{R}_\zeta$ resp. $\mathfrak{R}_z$ indicates that the variable is called $\zeta$ in the one case and $z$ in the other. To denote a point of $\mathfrak{R}_\zeta$, of $\mathfrak{R}_z$, or of $\mathfrak{Z} = \mathfrak{R}_\zeta \times \mathfrak{R}_z$ we use the symbols $p(\zeta)$, $p(z)$, and $p(\zeta,z) = p(\zeta) \times p(z)$ respectively.
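As a sanity check on Theorem A₂, the schlicht case can be written out. The sketch below assumes that $\mathfrak{R}$ is the finite $z$-plane and takes $A(\zeta,z) = 1/(\zeta - z)$; this choice is an illustration, not something the source singles out. With the local parameters $\tau = \zeta - \zeta'$ and $t = z - z'$ it satisfies both conditions of Theorem A₁ with $R \equiv 0$, and the generalized formula reduces to the classical Cauchy integral.

```latex
% Illustrative check only. Assumptions: \mathfrak{R} is the finite z-plane,
% A(\zeta,z) = 1/(\zeta - z), local parameters \tau = \zeta - \zeta', t = z - z'.
\[
  A(\zeta,z)\,\frac{d\zeta}{d\tau} = \frac{1}{\tau - t}
  \quad (\zeta' = z'), \qquad
  A(\zeta,z)\,\frac{d\zeta}{d\tau} = \frac{1}{(\zeta' - z') + \tau - t}
  \quad (\zeta' \neq z'),
\]
% the second expression being regular near (\tau,t) = (0,0). Theorem A_2 then reads
\[
  f(z) = \frac{1}{2\pi i}\int_{\mathfrak{C}} f(\zeta)\,A(\zeta,z)\,d\zeta
       = \frac{1}{2\pi i}\int_{\mathfrak{C}} \frac{f(\zeta)}{\zeta - z}\,d\zeta .
\]
```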


On Riemann surfaces the following transfer of Runge's approximation theorem holds:

Theorem B₁. Let $\mathfrak{G}$ and $\mathfrak{G}^*$ be domains on the non-closed Riemann surface $\mathfrak{R}$, and let $\mathfrak{G}$ be a subdomain of $\mathfrak{G}^*$ (in symbols $\mathfrak{G} \subset \mathfrak{G}^*$). Let $\mathfrak{G}$ moreover be simply connected relative to $\mathfrak{G}^*$. Then every function $f(z)$ that is single-valued and regular in $\mathfrak{G}$ can be approximated uniformly in the interior of $\mathfrak{G}$ by functions that are single-valued and regular in $\mathfrak{G}^*$.

Here the definition of relative simple connectivity must be given: $\mathfrak{G}$ is called simply connected relative to $\mathfrak{G}^*$ if every finite system of closed Jordan curves in $\mathfrak{G}$ that bounds in $\mathfrak{G}^*$ already bounds in $\mathfrak{G}$.
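A small schlicht example may make the role of relative simple connectivity concrete. The domains below (an annulus inside a disc in the finite plane) and the names $\mathfrak{G}$, $\mathfrak{G}^*$ for them are chosen here purely for illustration; they show how the approximation statement fails when the hypothesis is violated.

```latex
% Illustrative example only; the specific domains are assumptions, not from the source.
\[
  \mathfrak{G} = \{\, 1 < |z| < 2 \,\}, \qquad \mathfrak{G}^* = \{\, |z| < 3 \,\}.
\]
% The circle \gamma : |z| = 3/2 lies in \mathfrak{G} and bounds in \mathfrak{G}^*
% but not in \mathfrak{G}, so \mathfrak{G} is not simply connected relative to \mathfrak{G}^*.
% Take f(z) = 1/z and let g be regular in \mathfrak{G}^* with |f - g| < \varepsilon on \gamma. Then
\[
  2\pi = \Bigl|\oint_\gamma f\,dz - \oint_\gamma g\,dz\Bigr|
       \le \varepsilon \cdot \operatorname{length}(\gamma) = 3\pi\,\varepsilon ,
\]
% since \oint_\gamma f\,dz = 2\pi i and \oint_\gamma g\,dz = 0.
% Hence \varepsilon \ge 2/3: f cannot be approximated uniformly on \gamma
% by functions regular in \mathfrak{G}^*.
```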

For functions of several variables we have:

Theorem B₂. Let the cylinder domain $\mathfrak{Z} = \mathfrak{R}_1 \times \dots \times \mathfrak{R}_n$ over the space of the $n$ complex variables $z_1,\dots,z_n$ be a subdomain of a cylinder domain $\mathfrak{Z}^* = \mathfrak{R}^*_1 \times \dots \times \mathfrak{R}^*_n$ whose $z_j$-projections $\mathfrak{R}^*_j$, $j = 1,\dots,n$, are subdomains of non-closed Riemann surfaces. Let each projection $\mathfrak{R}_j$ be simply connected relative to $\mathfrak{R}^*_j$. Then every function $f(z_1,\dots,z_n)$ that is single-valued and regular in $\mathfrak{Z}$ can be approximated uniformly in the interior of $\mathfrak{Z}$ by functions that are single-valued and regular in $\mathfrak{Z}^*$.

We further need a statement on the existence of integrals of the 1st kind on non-closed Riemann surfaces with prescribed periodicity moduli:

Theorem C. Let $\mathfrak{R}$ be a non-closed Riemann surface and $\mathfrak{C}_1,\dots,\mathfrak{C}_\nu,\dots$ a system of simple closed oriented curves on $\mathfrak{R}$ that represents a one-dimensional homology basis of $\mathfrak{R}$ (with respect to the ring of integers as coefficient domain). Let arbitrary complex numbers $a_\nu$ be assigned to the $\mathfrak{C}_\nu$. Then there is a function $\mathfrak{J}(z)$ that can be continued regularly without restriction on $\mathfrak{R}$ and that increases by the constant $a_\nu$ under continuation along $\mathfrak{C}_\nu$ or along a closed curve homologous to $\mathfrak{C}_\nu$.

$\mathfrak{J}(z)$ is called an integral of the 1st kind with the periodicity moduli¹¹ $a_\nu$.

¹⁰ We speak of a piecewise smooth curve segment if it consists of finitely many real-analytic curve segments.

¹¹ The notion of an integral of the 1st kind on an open Riemann surface has been defined more narrowly in some recent papers. See for instance H. Hornich, "Über transzendente Integrale erster Gattung," Monatshefte Math.-Phys., vol. 47 (1939), 380 ff.










II. The Cousin theorems for direct products of non-closed Riemann surfaces. Let $\mathfrak{G}$ be a domain over the space of the complex variables $z_1,\dots,z_n$. To every point $P$ of $\mathfrak{G}$ let there be assigned a neighborhood $\mathfrak{U}(P)$ and a function $f_P(z_1,\dots,z_n)$ meromorphic in $\mathfrak{U}(P)$, such that in the intersection $\mathfrak{D}(\mathfrak{U}(P),\mathfrak{U}(Q))$ of the neighborhoods of two points $P$ and $Q$ of $\mathfrak{G}$ the function

\[ f_{PQ} = f_P - f_Q \]

is regular. We say that a Cousin distribution of meromorphic local functions with equivalence with respect to subtraction is given in $\mathfrak{G}$. The question is whether there exists a function $F(z_1,\dots,z_n)$ meromorphic in the whole domain $\mathfrak{G}$ such that in each $\mathfrak{U}(P)$ the difference $F(z_1,\dots,z_n) - f_P(z_1,\dots,z_n)$ is regular, i.e., such that $F$ is equivalent to all $f_P$ with respect to subtraction. If for every such Cousin distribution of meromorphic local functions in $\mathfrak{G}$ there exists an associated function $F(z_1,\dots,z_n)$, we say that the first Cousin statement holds in $\mathfrak{G}$.¹²

If to every point $P$ of $\mathfrak{G}$ a regular local function $f_P$ is assigned in some $\mathfrak{U}(P)$, so that in each $\mathfrak{D}(\mathfrak{U}(P),\mathfrak{U}(Q))$ the quotient $f_P/f_Q$ is regular and different from zero, we speak of a Cousin distribution of regular local functions in $\mathfrak{G}$ with equivalence with respect to division. If one can always find for it a function $F(z_1,\dots,z_n)$ regular in $\mathfrak{G}$ that is equivalent to all $f_P$ with respect to division, we say that the second Cousin statement holds in $\mathfrak{G}$.

In what follows we choose as the domain $\mathfrak{G}$ the direct product of $n$ non-closed Riemann surfaces $\mathfrak{R}_1,\dots,\mathfrak{R}_n$; such a domain is called a cylinder domain. Branch points of finite order and points at infinity are admitted as interior points of each $\mathfrak{R}_j$; the running variable of $\mathfrak{R}_j$ is denoted by $z_j$. Regularity and meromorphy are always referred to the local uniformizers of the individual Riemann surfaces. A function $\varphi(z_1,\dots,z_n)$ is thus called regular resp. meromorphic at a given point $P$ of $\mathfrak{G}$ if it is regular resp. meromorphic with respect to the local uniformizers associated with $P$.

Theorem 1. In every cylinder domain $\mathfrak{Z} = \mathfrak{R}_1 \times \dots \times \mathfrak{R}_n$, where the $\mathfrak{R}_j$ are non-closed Riemann surfaces, the first Cousin statement holds.

Cousin's proof for the case of schlicht cylinder domains consists of two essential steps: a joining procedure and an approximation process.² We show that both steps can also be carried out in our case.

To this end we prove the following


Lemma A. Let $\mathfrak{B}_1$, $\mathfrak{B}^*_1$ be closed domains in the interior of $\mathfrak{R}_1$, and let $\mathfrak{B}_2,\dots,\mathfrak{B}_n$ be closed domains in $\mathfrak{R}_2,\dots,\mathfrak{R}_n$ respectively. Let the intersection $\mathfrak{K}$ of $\mathfrak{B}_1$ and $\mathfrak{B}^*_1$ consist of finitely many mutually disjoint, non-closed, piecewise smooth curve segments $\mathfrak{K}_\rho$, $\rho = 1,\dots,k$ (which therefore belong to the boundaries of $\mathfrak{B}_1$ and $\mathfrak{B}^*_1$). Further let $f(z_1,\dots,z_n)$ and $f^*(z_1,\dots,z_n)$ be functions meromorphic in $\mathfrak{Z}_0 = \mathfrak{B}_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$ and $\mathfrak{Z}^*_0 = \mathfrak{B}^*_1 \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$ respectively, which are equivalent with respect to subtraction on $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$. Then there exists in the union

\[
  \mathfrak{Z}_0 + \mathfrak{Z}^*_0 = (\mathfrak{B}_1 + \mathfrak{B}^*_1) \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n
\]

a meromorphic function $G(z_1,\dots,z_n)$ that is equivalent with respect to subtraction to $f(z_1,\dots,z_n)$ in $\mathfrak{Z}_0$ and to $f^*(z_1,\dots,z_n)$ in $\mathfrak{Z}^*_0$.

¹² See H. Behnke and K. Stein, "Analytische Funktionen mehrerer Veränderlichen zu vorgegebenen Null- und Polstellen," Jber. Deutschen Math. Verein., vol. 47 (1937), 177-192.


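Proof. On $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$ we set

\[
  \varphi(z_1,\dots,z_n) = f(z_1,\dots,z_n) - f^*(z_1,\dots,z_n);
\]

by hypothesis $\varphi$ is regular there. We then form

\[
  \Phi(z_1,\dots,z_n) = \frac{1}{2\pi i}\int_{\mathfrak{K}^*} \varphi(\zeta,z_2,\dots,z_n)\,A(\zeta,z_1)\,d\zeta .
\]

Here $A(\zeta,z_1)$ denotes an elementary function of the 1st order on $\mathfrak{R}_1$. The integration extends over a set $\mathfrak{K}^*$ of curve segments $\mathfrak{K}^*_\rho$ obtained from the $\mathfrak{K}_\rho$ as follows: each $\mathfrak{K}_\rho$ is extended at its ends by adding small curve segments that lie in the exterior of $\mathfrak{B}_1$ and $\mathfrak{B}^*_1$, but in such a way that the point set $\mathfrak{K}^* \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$ still belongs entirely to the domain of regularity of $\varphi(z_1,\dots,z_n)$. In particular, the endpoints of the $\mathfrak{K}^*_\rho$ are to lie outside $\mathfrak{B}_1$ and $\mathfrak{B}^*_1$. $\Phi(z_1,\dots,z_n)$ then represents a regular function $\Phi_1(z_1,\dots,z_n)$ in $\mathfrak{Z}_0$ and a regular function $\Phi^*_1(z_1,\dots,z_n)$ in $\mathfrak{Z}^*_0$. Since $A(\zeta,z_1)$ behaves like $1/(\zeta - z_1)$ for $\zeta$ and $z_1$ on $\mathfrak{K}^*$, the functions $\Phi_1$ and $\Phi^*_1$ remain regular under continuation from the interior of $\mathfrak{Z}_0$ resp. $\mathfrak{Z}^*_0$ to $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$, and there they differ additively by $\varphi(z_1,\dots,z_n)$; that is, with a suitable orientation of the $\mathfrak{K}^*_\rho$ we have on $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$:

\[
  \varphi(z_1,\dots,z_n) = \Phi_1(z_1,\dots,z_n) - \Phi^*_1(z_1,\dots,z_n),
\]

or

\[
  f(z_1,\dots,z_n) - f^*(z_1,\dots,z_n) = \Phi_1(z_1,\dots,z_n) - \Phi^*_1(z_1,\dots,z_n).
\]

If we now set

\[
  g(z_1,\dots,z_n) = f(z_1,\dots,z_n) - \Phi_1(z_1,\dots,z_n) \quad \text{in } \mathfrak{Z}_0,
\]
\[
  g^*(z_1,\dots,z_n) = f^*(z_1,\dots,z_n) - \Phi^*_1(z_1,\dots,z_n) \quad \text{in } \mathfrak{Z}^*_0,
\]

then on $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$:

\[
  g(z_1,\dots,z_n) = g^*(z_1,\dots,z_n).
\]

Hence

\[
  G(z_1,\dots,z_n) =
  \begin{cases}
    g(z_1,\dots,z_n) & \text{in } \mathfrak{Z}_0, \\
    g^*(z_1,\dots,z_n) & \text{in } \mathfrak{Z}^*_0
  \end{cases}
\]

represents a function of the desired kind in $\mathfrak{Z}_0 + \mathfrak{Z}^*_0$.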

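To prove Theorem 1, $\mathfrak{Z} = \mathfrak{R}_1 \times \dots \times \mathfrak{R}_n$ is now approximated by cylinder domains $\mathfrak{Z}_\nu = \mathfrak{R}_1^{(\nu)} \times \dots \times \mathfrak{R}_n^{(\nu)}$ with the following properties:

(1) Each $\mathfrak{R}_j^{(\nu)}$ lies entirely in the interior of $\mathfrak{R}_j$ and is a polygonal domain that is simply connected relative to $\mathfrak{R}_j$;

(2) Each $\mathfrak{R}_j^{(\nu)}$ is a proper subdomain of $\mathfrak{R}_j^{(\nu+1)}$;

(3) $\lim_{\nu \to \infty} \mathfrak{R}_j^{(\nu)} = \mathfrak{R}_j$.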


For a Cousin distribution of meromorphic local functions with equivalence with respect to subtraction given in $\mathfrak{Z}$, a solution function $F_\nu(z_1,\dots,z_n)$ can then first be constructed in each $\mathfrak{Z}_\nu$ as follows. Each $\mathfrak{R}_j^{(\nu)}$ is triangulated; let the elementary triangles of the decomposition of $\mathfrak{R}_j^{(\nu)}$ be $\mathfrak{G}_{j,\kappa_j}^{(\nu)}$, $\kappa_j = 1,\dots,m_j$. Let the decompositions be chosen so fine that each domain $\mathfrak{G}_{1,\kappa_1}^{(\nu)} \times \mathfrak{G}_{2,\kappa_2}^{(\nu)} \times \dots \times \mathfrak{G}_{n,\kappa_n}^{(\nu)}$ is entirely covered by a $\mathfrak{U}(P)$ belonging to the given Cousin distribution; to each $\mathfrak{G}_{1,\kappa_1}^{(\nu)} \times \dots \times \mathfrak{G}_{n,\kappa_n}^{(\nu)}$ there thus belongs a local function $f_{\kappa_1,\dots,\kappa_n}(z_1,\dots,z_n)$ that already represents a solution function there. Further, for each $j$ let the triangles $\mathfrak{G}_{j,\kappa_j}^{(\nu)}$ be numbered in such a way that $\mathfrak{G}_{j,\kappa_j}^{(\nu)}$ is always joined to the complex $\mathfrak{G}_{j,1}^{(\nu)} + \dots + \mathfrak{G}_{j,\kappa_j - 1}^{(\nu)}$, $2 \le \kappa_j \le m_j$, along at least one side or vertex, but never along all three sides. That such a numbering of the triangles of a connected finite triangle complex on a non-closed Riemann surface is always possible follows easily by induction on the number of triangles. The functions $f_{\kappa_1,\dots,\kappa_n}(z_1,\dots,z_n)$ are now, in analogy with Cousin's procedure, joined by finitely many applications of Lemma A into a solution function $F_\nu(z_1,\dots,z_n)$ in $\mathfrak{Z}_\nu$.

From the $F_\nu(z_1,\dots,z_n)$ a solution function $F(z_1,\dots,z_n)$ in the whole domain $\mathfrak{Z}$ is then obtained, likewise in analogy with Cousin's approximation procedure. In $\mathfrak{Z}_\nu$ let

\[
  d_\nu(z_1,\dots,z_n) = F_{\nu+1}(z_1,\dots,z_n) - F_\nu(z_1,\dots,z_n).
\]

The function $d_\nu(z_1,\dots,z_n)$ is regular in $\mathfrak{Z}_\nu$, since $F_{\nu+1}$ and $F_\nu$ are equivalent there with respect to subtraction. By Theorem B₂ we can find a function $r_\nu(z_1,\dots,z_n)$ regular in $\mathfrak{Z}$ such that in $\mathfrak{Z}_{\nu-1}$

\[
  \bigl| d_\nu(z_1,\dots,z_n) - r_\nu(z_1,\dots,z_n) \bigr| < \dots \quad \text{[bound illegible in the source].}
\]

We now form

\[
  F(z_1,\dots,z_n) = F_1(z_1,\dots,z_n) + \{F_2(z_1,\dots,z_n) - F_1(z_1,\dots,z_n) - r_1(z_1,\dots,z_n)\} + \dots
\]
\[
  \qquad + \{F_{\nu+1}(z_1,\dots,z_n) - F_\nu(z_1,\dots,z_n) - r_\nu(z_1,\dots,z_n)\} + \dots
\]

In every domain lying entirely in the interior of $\mathfrak{Z}$ this infinite series converges absolutely and uniformly after removal of finitely many terms; it therefore represents a function meromorphic in all of $\mathfrak{Z}$. By the construction of the $F_\nu(z_1,\dots,z_n)$, this function is a solution function of the desired kind for the given Cousin distribution. This proves Theorem 1.
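The reason the series yields a solution can be seen from its partial sums. The sketch below is a reading aid, not the authors' text: it assumes that the series starts with $F_1$, and it assumes a summable bound $\varepsilon_\nu$ in the approximation step (for instance $\varepsilon_\nu = 2^{-\nu}$), since the actual bound is not legible in the scan.

```latex
% Sketch under assumptions: the bound \varepsilon_\nu (with \sum \varepsilon_\nu < \infty)
% is hypothetical; d_\nu = F_{\nu+1} - F_\nu and r_\nu are as in the text.
\[
  S_N = F_1 + \sum_{\nu=1}^{N} \bigl( F_{\nu+1} - F_\nu - r_\nu \bigr)
      = F_{N+1} - \sum_{\nu=1}^{N} r_\nu .
\]
% Each r_\nu is regular in \mathfrak{Z}, so S_N is equivalent with respect to
% subtraction to F_{N+1}, hence to the given local functions, throughout \mathfrak{Z}_{N+1}.
% For a fixed \mu and N > \mu the remainder satisfies, in \mathfrak{Z}_\mu,
\[
  \Bigl| \sum_{\nu > N} \bigl( d_\nu - r_\nu \bigr) \Bigr|
  \le \sum_{\nu > N} \varepsilon_\nu \longrightarrow 0 \quad (N \to \infty),
\]
% so F - S_N is regular in the interior of \mathfrak{Z}_\mu and F is a solution there.
```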

On the solvability of the second Cousin problem we have

Theorem 2. In every cylinder domain $\mathfrak{Z} = \mathfrak{R}_1 \times \dots \times \mathfrak{R}_n$, where the $\mathfrak{R}_j$ are non-closed Riemann surfaces that are simply connected with at most one exception, the second Cousin statement holds.






The proof runs completely analogously to the proof of Theorem 1. In place of Lemma A there is a corresponding Lemma B, in which $f$, $f^*$ and $G$ now denote regular functions that are equivalent with respect to division. In addition, $\mathfrak{B}_2,\dots,\mathfrak{B}_n$ must be assumed to be simply connected. To prove this lemma one forms on $\mathfrak{K} \times \mathfrak{B}_2 \times \dots \times \mathfrak{B}_n$

\[
  \psi(z_1,\dots,z_n) = \log \frac{f(z_1,\dots,z_n)}{f^*(z_1,\dots,z_n)};
\]

then $\psi(z_1,\dots,z_n)$ is single-valued there, owing to the simple connectivity of $\mathfrak{B}_2,\dots,\mathfrak{B}_n$ and of the $\mathfrak{K}_\rho$, provided a fixed branch of the logarithm is chosen. Further set

\[
  \chi(z_1,\dots,z_n) = \frac{1}{2\pi i}\int_{\mathfrak{K}^*} \psi(\zeta,z_2,\dots,z_n)\,A(\zeta,z_1)\,d\zeta .
\]

Then, similarly as above,

\[
  G(z_1,\dots,z_n) =
  \begin{cases}
    f(z_1,\dots,z_n)\cdot e^{-\chi(z_1,\dots,z_n)} & \text{in } \mathfrak{Z}_0, \\
    f^*(z_1,\dots,z_n)\cdot e^{-\chi(z_1,\dots,z_n)} & \text{in } \mathfrak{Z}^*_0
  \end{cases}
\]

is a function $G(z_1,\dots,z_n)$ of the kind to be constructed in $\mathfrak{Z}_0 + \mathfrak{Z}^*_0$.

If now a Cousin distribution of regular local functions with equivalence with respect to division is given in $\mathfrak{Z}$, then, as in the proof of Theorem 1, solution functions $F_\nu(z_1,\dots,z_n)$ can be constructed in approximating cylinder domains $\mathfrak{Z}_\nu = \mathfrak{R}_1^{(\nu)} \times \dots \times \mathfrak{R}_n^{(\nu)}$ by suitable joining of local functions under repeated application of Lemma B. Here the $\mathfrak{R}_j^{(\nu)}$ are chosen with the same properties as above.


For the approximation procedure, however, a special remark is still required. In $\mathfrak{Z}_\nu$ we form

\[
  l_\nu(z_1,\dots,z_n) = \log \frac{F_{\nu+1}(z_1,\dots,z_n)}{F_\nu(z_1,\dots,z_n)} .
\]

Since $F_{\nu+1}$ and $F_\nu$ are equivalent in $\mathfrak{Z}_\nu$ with respect to division, $l_\nu(z_1,\dots,z_n)$ is regular in $\mathfrak{Z}_\nu$; but it need not remain single-valued there, since simple connectivity is assumed only for $n - 1$ of the $\mathfrak{R}_j$. Thus $l_\nu(z_1,\dots,z_n)$ may possess periodicity moduli in $\mathfrak{Z}_\nu$, all of which are integer multiples of $2\pi i$. By Theorem C we can now find an integral of the 1st kind $a_\nu(z_n)$ on $\mathfrak{R}_n$ (let $\mathfrak{R}_n$ be that component of $\mathfrak{Z}$ for which simple connectivity is not assumed) such that

\[
  l_\nu(z_1,\dots,z_n) - a_\nu(z_n) = s_\nu(z_1,\dots,z_n)
\]

is regular and single-valued in $\mathfrak{Z}_\nu$; here definite branches of $l_\nu$ and $a_\nu$ are chosen on the left. Further, by Theorem B₂ there is a function $r_\nu(z_1,\dots,z_n)$ regular in $\mathfrak{Z}$ such that in $\mathfrak{Z}_{\nu-1}$

\[
  \bigl| s_\nu(z_1,\dots,z_n) - r_\nu(z_1,\dots,z_n) \bigr| < \dots \quad \text{[bound and the following product formula illegible in the source].}
\]

The infinite product so formed now converges absolutely and uniformly in every domain lying entirely in the interior of $\mathfrak{Z}$, after removal of finitely many factors; by the construction of the $F_\nu(z_1,\dots,z_n)$ it represents a solution function of the desired kind for the given Cousin distribution. This proves Theorem 2.
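The product formula itself did not survive in the scan. The following is a hedged reconstruction of one natural form it could take, modelled on the series in the proof of Theorem 1; the exact shape of the product and the summable bound $\varepsilon_\nu$ are assumptions, not a quotation of the authors' formula.

```latex
% Hypothetical reconstruction (not taken from the source): product analogue of the series.
\[
  F = F_1 \cdot \prod_{\nu=1}^{\infty}
      \Bigl\{ \frac{F_{\nu+1}}{F_\nu}\, e^{-a_\nu(z_n) - r_\nu} \Bigr\},
  \qquad
  \frac{F_{\nu+1}}{F_\nu}\, e^{-a_\nu - r_\nu} = e^{\,s_\nu - r_\nu}.
\]
% e^{-a_\nu(z_n)} is single-valued, regular and free of zeros on \mathfrak{R}_n,
% because all periodicity moduli of a_\nu are integer multiples of 2\pi i.
% Partial products telescope:
\[
  P_N = F_1 \prod_{\nu=1}^{N} \frac{F_{\nu+1}}{F_\nu}\, e^{-a_\nu - r_\nu}
      = F_{N+1} \cdot \exp\Bigl( -\sum_{\nu=1}^{N} (a_\nu + r_\nu) \Bigr),
\]
% so P_N is equivalent with respect to division to F_{N+1}. If |s_\nu - r_\nu| < \varepsilon_\nu
% with \sum \varepsilon_\nu < \infty, the remaining factors e^{s_\nu - r_\nu} form a
% convergent product without zeros in the interior of each \mathfrak{Z}_\mu.
```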


III. The continuity theorem. We wish to prove the continuity theorem in the following version:

Theorem 3. Let $\mathfrak{G}$ and $\mathfrak{G}_1, \mathfrak{G}_2, \dots$ with boundaries $\mathfrak{C}$ and $\mathfrak{C}_1, \mathfrak{C}_2, \dots$ be domains on two-dimensional completed analytic surface pieces¹³ $\mathfrak{F}$ and $\mathfrak{F}_1, \mathfrak{F}_2, \dots$ in the space of $n$ complex variables. Let the $\mathfrak{F}_m$ converge to $\mathfrak{F}$ in such a way that the $\mathfrak{G}_m$ tend to $\mathfrak{G}$. Let $g(z_1,\dots,z_n)$ be regular (resp. meromorphic) and single-valued on all $\mathfrak{F}_m$ inside $\mathfrak{C}_m$, and also on $\mathfrak{C}$. Then $g(z_1,\dots,z_n)$ can be continued regularly (resp. meromorphically) on $\mathfrak{F}$ from the boundary into all of $\mathfrak{G}$, and it remains single-valued in a $2n$-dimensional neighborhood of $\mathfrak{G}$.

For the proof we single out one of the analytic surfaces $\mathfrak{F}_{m_0}$ that approximate $\mathfrak{F}$. If necessary after a suitable coordinate transformation, $\mathfrak{F}_{m_0}$ admits the representation

\[
  z_\nu = \varphi_\nu(z_n), \qquad \nu = 1,\dots,n-1,
\]

where the $\varphi_\nu(z_n)$ are regular and single-valued on a suitable common Riemann surface $\mathfrak{R}$ over the $z_n$-plane. To the domain $\mathfrak{G}_{m_0}$ together with its boundary $\mathfrak{C}_{m_0}$ there corresponds on $\mathfrak{R}$ a domain $\mathfrak{G}^*_{m_0}$ with boundary $\mathfrak{C}^*_{m_0}$, which without loss of generality may be assumed to be piecewise smooth, as may all $\mathfrak{C}_m$ and $\mathfrak{C}$.

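Besides $\mathfrak{F}_{m_0}$ we consider the neighboring surfaces

\[
  \mathfrak{F}_{m_0}(\eta_1,\dots,\eta_{n-1}):\quad z_\nu = \varphi_\nu(z_n) + \eta_\nu, \qquad \nu = 1,\dots,n-1,
\]

where the $\eta_\nu$ are complex parameters that vary in the $(2n-2)$-dimensional domain

\[
  |\eta_\nu| < \delta, \qquad \nu = 1,\dots,n-1 .
\]

For all these $\mathfrak{F}_{m_0}(\eta_1,\dots,\eta_{n-1})$ the variable $z_n$ runs over $\mathfrak{R}$. The image of $\mathfrak{G}^*_{m_0}$ on $\mathfrak{F}_{m_0}(\eta_1,\dots,\eta_{n-1})$ is called $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$, and its boundary $\mathfrak{C}_{m_0}(\eta_1,\dots,\eta_{n-1})$. We suppose $\mathfrak{F}_{m_0}$ chosen sufficiently close to $\mathfrak{F}$ and a suitable $\delta$ fixed such that the following conditions are satisfied:

(1) For all $(\eta_1,\dots,\eta_{n-1})$ with $|\eta_\nu| \le \delta$, the boundary $\mathfrak{C}_{m_0}(\eta_1,\dots,\eta_{n-1})$ belongs to a $2n$-dimensional neighborhood $\mathfrak{U}(\mathfrak{C})$ of $\mathfrak{C}$ in which, by hypothesis, the given function $g(z_1,\dots,z_n)$ is regular (resp. meromorphic) and single-valued.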


¹³ See B.-Th. Bericht, 25.


















(2) The union of the points of all $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$ for $|\eta_\nu| < \delta$ forms a $2n$-dimensional neighborhood $\mathfrak{U}(\mathfrak{G})$ of $\mathfrak{G}$.

Our theorem is proved once we can show that on every $\mathfrak{F}_{m_0}(\eta_1,\dots,\eta_{n-1})$ with $|\eta_\nu| < \delta$ the function $g(z_1,\dots,z_n)$ can be continued single-valuedly and regularly (resp. meromorphically) from the boundary $\mathfrak{C}_{m_0}(\eta_1,\dots,\eta_{n-1})$ into the entire interior of the domain $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$, and that the function elements so obtained constitute a single-valued function in $\mathfrak{U}(\mathfrak{G})$.

We carry out this proof first for the case of a regular function $g(z_1,\dots,z_n)$. To this end we set

\[
  f(z_n) = g(\varphi_1(z_n),\dots,\varphi_{n-1}(z_n),z_n),
\]

where $z_n$ varies in the closed domain $\mathfrak{G}^*_{m_0}$ on $\mathfrak{R}$. There $f(z_n)$ is single-valued and (with respect to the local uniformizers) regular. The same property is possessed by

\[
  f(\eta_1,\dots,\eta_{n-1},z_n) = g(\varphi_1(z_n) + \eta_1,\dots,\varphi_{n-1}(z_n) + \eta_{n-1},z_n)
\]

for $|\eta_\nu| < \delta_1$, with sufficiently small $\delta_1$, $0 < \delta_1 < \delta$, and indeed with respect to all $n$ variables $\eta_1,\dots,\eta_{n-1},z_n$, since $g(z_1,\dots,z_n)$ is assumed to be regular and single-valued in a $2n$-dimensional neighborhood of $\mathfrak{F}_{m_0}$. Furthermore, $f(\eta_1,\dots,\eta_{n-1},z_n)$ is regular and single-valued for $|\eta_\nu| \le \delta$ $(\nu = 1,\dots,n-1)$ and $z_n$ in a sufficiently small neighborhood of $\mathfrak{C}^*_{m_0}$ on $\mathfrak{R}$. Now let $A(\zeta,z_n)$ be an elementary function of the 1st order on $\mathfrak{R}$ according to Theorem A₁. For $|\eta_\nu| < \delta_1$, $\nu = 1,\dots,n-1$, and $z_n$ in $\mathfrak{G}^*_{m_0}$ we then have

\[
  f(\eta_1,\dots,\eta_{n-1},z_n) = \frac{1}{2\pi i}\int_{\mathfrak{C}^*_{m_0}} f(\eta_1,\dots,\eta_{n-1},\zeta)\,A(\zeta,z_n)\,d\zeta .
\]

The right-hand side, however, is regular in the variables $\eta_1,\dots,\eta_{n-1},z_n$ for $\{|\eta_\nu| < \delta$ and $z_n$ in $\mathfrak{G}^*_{m_0}\}$; it therefore represents there a continuation of $f(\eta_1,\dots,\eta_{n-1},z_n)$ into the domain $\{|\eta_\nu| < \delta;\ z_n$ in $\mathfrak{G}^*_{m_0}\}$. With this a continuation of $g(z_1,\dots,z_n)$ into $\mathfrak{U}(\mathfrak{G})$ has also been found. If $g(z_1,\dots,z_n)$ were branched there, the only possible branch manifolds would be planes $z_n = c$, namely those planes $z_n = c$ that lie over branch points of $\mathfrak{R}$. But then $g(z_1,\dots,z_n)$ would have to possess these planes as branch manifolds also in a neighborhood of $\mathfrak{G}_{m_0}$; this cannot be the case, since $g(z_1,\dots,z_n)$ is assumed to be single-valued there. $g(z_1,\dots,z_n)$ must also remain single-valued under every continuation within $\mathfrak{U}(\mathfrak{G})$. For every closed path can be deformed within $\mathfrak{U}(\mathfrak{G})$ into $\mathfrak{G}_{m_0}$, and along a closed path in $\mathfrak{G}_{m_0}$ the function $g(z_1,\dots,z_n)$ can by hypothesis be continued single-valuedly.

This proves our assertion and the continuity theorem for regular functions.

It remains to carry out the outstanding proof also for the case of a meromorphic $g(z_1,\dots,z_n)$.

















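We again consider the surfaces $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$:

\[
  z_\nu = \varphi_\nu(z_n) + \eta_\nu, \qquad \nu = 1,\dots,n-1, \quad |\eta_\nu| \le \delta, \quad z_n \text{ in } \mathfrak{G}^*_{m_0}.
\]

Let $\mathfrak{D}^*_{m_0}$ be a sufficiently narrow strip along the boundary $\mathfrak{C}^*_{m_0}$ of $\mathfrak{G}^*_{m_0}$ (which, together with $\mathfrak{C}^*_{m_0}$, may possibly split into separate components). Then by hypothesis

\[
  f(\eta_1,\dots,\eta_{n-1},z_n) = g(\varphi_1(z_n) + \eta_1,\dots,\varphi_{n-1}(z_n) + \eta_{n-1},z_n)
\]

is meromorphic for

\[
  \{|\eta_\nu| \le \delta;\ z_n \text{ in } \mathfrak{D}^*_{m_0}\}
  \quad\text{and}\quad
  \{|\eta_\nu| < \delta_1 < \delta;\ z_n \text{ in } \mathfrak{G}^*_{m_0}\}.
\]

It can happen that for certain fixed value systems $\eta_1^{(0)},\dots,\eta_{n-1}^{(0)}$ from $|\eta_\nu| \le \delta$ the function $f(\eta_1,\dots,\eta_{n-1},z_n)$, as a function of $z_n$, becomes singular for all $z_n$ from a component of $\mathfrak{D}^*_{m_0}$, i.e., that a 2-dimensional piece of $\mathfrak{G}_{m_0}(\eta_1^{(0)},\dots,\eta_{n-1}^{(0)})$ belongs entirely to the polar manifold of $g(z_1,\dots,z_n)$ in the part of $\mathfrak{U}(\mathfrak{G})$ lying over $\mathfrak{D}^*_{m_0}$. However, we can always avoid this occurrence by passing to a family of suitable neighboring surfaces $\tilde{\mathfrak{G}}_{m_0}(\eta_1,\dots,\eta_{n-1})$. More precisely: there are an $\varepsilon > 0$ and suitable constants $a_\nu$ with $|a_\nu| < \varepsilon$ such that the following holds:

(1) The surface pieces $\tilde{\mathfrak{G}}_{m_0}(\eta_1,\dots,\eta_{n-1})$:

\[
  z_\nu = \varphi_\nu(z_n) + a_\nu z_n + \eta_\nu = \tilde{\varphi}_\nu(z_n) + \eta_\nu, \qquad |\eta_\nu| \le \tilde{\delta}, \quad z_n \text{ in } \mathfrak{G}^*_{m_0}; \quad \nu = 1,\dots,n-1,
\]

fill a $2n$-dimensional neighborhood $\tilde{\mathfrak{U}}(\mathfrak{G})$ of $\mathfrak{G}$.

(2) $\tilde{f}(\eta_1,\dots,\eta_{n-1},z_n) = g(\tilde{\varphi}_1(z_n) + \eta_1,\dots,\tilde{\varphi}_{n-1}(z_n) + \eta_{n-1},z_n)$ is meromorphic for

\[
  \{|\eta_\nu| \le \tilde{\delta};\ z_n \text{ in } \mathfrak{D}^*_{m_0}\}
  \quad\text{and}\quad
  \{|\eta_\nu| < \tilde{\delta}_1 < \tilde{\delta};\ z_n \text{ in } \mathfrak{G}^*_{m_0}\}.
\]

(3) There is no fixed value system $\eta_1^{(0)},\dots,\eta_{n-1}^{(0)}$ with $|\eta_\nu^{(0)}| \le \tilde{\delta}$ such that $\tilde{f}(\eta_1,\dots,\eta_{n-1},z_n)$ becomes singular for $\eta_\nu = \eta_\nu^{(0)}$ and all $z_n$ from a component of $\mathfrak{D}^*_{m_0}$.¹⁴

We suppose this passage to the $\tilde{\mathfrak{G}}_{m_0}(\eta_1,\dots,\eta_{n-1})$ carried out if necessary, and then drop the tildes again. It may thus be assumed that no 2-dimensional piece of a $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$ in the part of $\mathfrak{U}(\mathfrak{G})$ lying over $\mathfrak{D}^*_{m_0}$ belongs to a polar manifold of $g(z_1,\dots,z_n)$, and hence none belongs to a manifold $g(z_1,\dots,z_n) = c$ for sufficiently large $c$ either.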
Suppose now that $g(z_1,\dots,z_n)$ cannot, for every value system $\eta_1,\dots,\eta_{n-1}$ with $|\eta_\nu| \le \delta$, be continued meromorphically and single-valuedly from the boundary $\mathfrak{C}_{m_0}(\eta_1,\dots,\eta_{n-1})$ into the whole interior of $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$. Among the value systems $\eta_1,\dots,\eta_{n-1}$ for which such a continuation is not possible there is then in particular one, say $\eta'_1,\dots,\eta'_{n-1}$, that is an accumulation point of other value systems $\eta_1^{(\mu)},\dots,\eta_{n-1}^{(\mu)}$ for each of which $g(z_1,\dots,z_n)$ can be continued single-valuedly and meromorphically into the entire interior of $\mathfrak{G}_{m_0}(\eta_1^{(\mu)},\dots,\eta_{n-1}^{(\mu)})$. The assumption of such a state of affairs can, however, be led to a contradiction. This is done as follows:

¹⁴ Cf. H. Kneser, Math. Ann. 106, loc. cit.

The point set

\[
  \mathfrak{D}'_{m_0}:\quad \{ z_\nu = \varphi_\nu(z_n) + \eta'_\nu;\ z_n \text{ in } \mathfrak{D}^*_{m_0} \}
\]

contains no 2-dimensional piece that belongs entirely to the polar manifold of $g(z_1,\dots,z_n)$. Hence $g(z_1,\dots,z_n)$ becomes singular on $\mathfrak{D}'_{m_0}$ at most at isolated points. Therefore $\mathfrak{G}^*_{m_0}$, $\mathfrak{C}^*_{m_0}$ and the strip domain $\mathfrak{D}^*_{m_0}$ adjoining $\mathfrak{C}^*_{m_0}$ can, if necessary, be modified so that $\mathfrak{D}'_{m_0}$ (we retain the notation) contains only regular points of $g(z_1,\dots,z_n)$. We then choose a value system $\eta^*_1,\dots,\eta^*_{n-1}$ sufficiently close to $\eta'_1,\dots,\eta'_{n-1}$ that satisfies the following conditions:

(1) The function
\[
  f(\eta_1,\dots,\eta_{n-1},z_n) = g(\varphi_1(z_n) + \eta_1,\dots,\varphi_{n-1}(z_n) + \eta_{n-1},z_n)
\]
is meromorphic and single-valued for $\{\eta_1 = \eta^*_1,\dots,\eta_{n-1} = \eta^*_{n-1};\ z_n$ in $\mathfrak{G}^*_{m_0}\}$,

(2) $f(\eta_1,\dots,\eta_{n-1},z_n)$ is regular and single-valued for $\{|\eta_\nu - \eta'_\nu| \le \delta';\ z_n$ in $\mathfrak{D}^*_{m_0}\}$,

(3) $|\eta^*_\nu - \eta'_\nu| < \delta'$,

(4) $f(\eta_1,\dots,\eta_{n-1},z_n)$ has no non-essential singularity of the 2nd kind for $\{\eta_1 = \eta^*_1,\dots,\eta_{n-1} = \eta^*_{n-1};\ z_n$ in $\mathfrak{G}^*_{m_0}\}$.

Condition (4) can certainly be satisfied by a suitable choice of $\eta^*_1,\dots,\eta^*_{n-1}$, since the non-essential singular points of the 2nd kind of $g(z_1,\dots,z_n)$ fill at most $(2n-4)$-dimensional analytic surface pieces.

If now $f(\eta_1,\dots,\eta_{n-1},z_n)$ is even regular for $\{\eta_\nu = \eta^*_\nu,\ z_n$ in $\mathfrak{G}^*_{m_0}\}$, then by the proof of the continuity theorem for regular functions given above, $f(\eta_1,\dots,\eta_{n-1},z_n)$ is also regular and single-valued for $\{\eta_\nu = \eta'_\nu;\ z_n$ in $\mathfrak{G}^*_{m_0}\}$; since the same then holds for $g(z_1,\dots,z_n)$ on $\mathfrak{G}_{m_0}(\eta'_1,\dots,\eta'_{n-1})$, the desired contradiction is reached. Otherwise $f(\eta^*_1,\dots,\eta^*_{n-1},z_n)$ has at least one pole in $\mathfrak{G}^*_{m_0}$. Let $c_0$ then be a complex number of such large absolute value that

\[
  h(\eta_1,\dots,\eta_{n-1},z_n) = \frac{1}{f(\eta_1,\dots,\eta_{n-1},z_n) - c_0}
\]

remains regular for $\{|\eta_\nu - \eta'_\nu| \le \delta',\ z_n$ in $\mathfrak{D}^*_{m_0}\}$ but, for $\eta_\nu = \eta^*_\nu$, has poles in $\mathfrak{G}^*_{m_0}$ as a function of $z_n$. In particular, $c_0$ can be fixed so that the poles of $h(\eta^*_1,\dots,\eta^*_{n-1},z_n) = h(z_n)$ within $\mathfrak{G}^*_{m_0}$ do not fall on branch points and are all simple; let their number be $r > 0$. Next a function $\Phi(\eta_1,\dots,\eta_{n-1},z_n)$ is constructed that is regular for $\{|\eta_\nu - \eta'_\nu| \le \delta',\ z_n$ in $\mathfrak{G}^*_{m_0}\}$ and does not vanish identically, but that for $\{|\eta_\nu - \eta'_\nu| \le \delta'' < \delta';\ z_n$ in $\mathfrak{G}^*_{m_0}\}$ vanishes at least at those points where $h(\eta_1,\dots,\eta_{n-1},z_n)$ has poles, and indeed to at least the same order. For this a function $\psi(z_n)$ regular in $\mathfrak{G}^*_{m_0}$ with the following property is needed: at the distinct poles of $h(z_n)$, $\psi(z_n)$ takes, to the first order, values that are different from zero and from one another. That such a $\psi(z_n)$ always exists is proved further below.

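In analogy with the procedure of H. Kneser¹⁴ we now set

\[
  U_\rho(\eta_1,\dots,\eta_{n-1}) = \frac{1}{2\pi i(\rho+1)} \int_{\mathfrak{C}^*_{m_0}}
  \frac{[\psi(z_n)]^{\rho+1}\cdot f_{z_n}(\eta_1,\dots,\eta_{n-1},z_n)}{[f(\eta_1,\dots,\eta_{n-1},z_n) - c_0]^2}\,dz_n
\]

and

\[
  \Phi(\eta_1,\dots,\eta_{n-1},z_n) =
  \begin{vmatrix}
    U_0 & U_1 & \dots & U_r \\
    \vdots & \vdots & & \vdots \\
    U_{r-1} & U_r & \dots & U_{2r-1} \\
    1 & \psi(z_n) & \dots & [\psi(z_n)]^r
  \end{vmatrix} .
\]

That $\Phi(\eta_1,\dots,\eta_{n-1},z_n)$ has the asserted properties can be proved exactly as in loc. cit.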
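Why a determinant of this kind vanishes at the poles of $h$ can be sketched for the parameter value $\eta_\nu = \eta^*_\nu$, where $h(z_n)$ has the $r$ simple poles described above. The sketch is only a reading aid: it assumes that $U_\rho$ is the contour integral of $[\psi]^{\rho+1} f_{z_n}/[f - c_0]^2$ over $\mathfrak{C}^*_{m_0}$ with the factor $1/(2\pi i(\rho+1))$, which is a reconstruction of a garbled formula, and the symbols $z_n^{(k)}$, $w_k$, $\gamma_k$ are introduced here and do not occur in the source.

```latex
% Sketch under the stated assumptions; z_n^{(k)}, w_k, \gamma_k are hypothetical names.
% Since dh = -f_{z_n}/(f - c_0)^2\,dz_n, integration by parts over the closed boundary gives
\[
  U_\rho = \frac{1}{2\pi i}\int_{\mathfrak{C}^*_{m_0}} h\,\psi^{\rho}\,\psi'\,dz_n
         = \sum_{k=1}^{r} \gamma_k\, w_k^{\rho},
  \qquad
  w_k = \psi\bigl(z_n^{(k)}\bigr), \quad
  \gamma_k = \psi'\bigl(z_n^{(k)}\bigr)\,\operatorname*{Res}_{z_n = z_n^{(k)}} h ,
\]
% where z_n^{(1)},\dots,z_n^{(r)} are the simple poles of h in \mathfrak{G}^*_{m_0}.
% Row i of the determinant (i = 0,\dots,r-1) is therefore
\[
  (U_i,\dots,U_{i+r}) = \sum_{k=1}^{r} \gamma_k\, w_k^{\,i}\,(1, w_k,\dots,w_k^{\,r}),
\]
% a combination of the r vectors (1, w_k,\dots,w_k^{\,r}). At a point with \psi(z_n) = w_j
% the last row equals one of these vectors, so the r+1 rows are dependent and \Phi = 0.
% Up to sign, factoring through a Vandermonde matrix gives
\[
  \Phi = \prod_{k=1}^{r}\gamma_k \cdot \prod_{k<l}(w_l - w_k)^2 \cdot \prod_{k=1}^{r}\bigl(\psi(z_n) - w_k\bigr),
\]
% which is not identically zero because the w_k are distinct and \gamma_k \neq 0.
```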





We now consider

\[
  F(\eta_1,\dots,\eta_{n-1},z_n) = \Phi(\eta_1,\dots,\eta_{n-1},z_n)\cdot h(\eta_1,\dots,\eta_{n-1},z_n) = \Phi/(f - c_0).
\]

$F(\eta_1,\dots,\eta_{n-1},z_n)$ is regular and single-valued for $\{|\eta_\nu - \eta'_\nu| \le \delta',\ z_n$ in $\mathfrak{D}^*_{m_0}\}$ and for $\{\eta_\nu = \eta^*_\nu,\ z_n$ in $\mathfrak{G}^*_{m_0}\}$; by the proof of the continuity theorem for regular functions carried out above it is therefore also regular and single-valued for $\{|\eta_\nu - \eta'_\nu| \le \delta',\ z_n$ in $\mathfrak{G}^*_{m_0}\}$. From this it follows that $f(\eta_1,\dots,\eta_{n-1},z_n)$ is meromorphic and single-valued for $\{\eta_\nu = \eta'_\nu,\ z_n$ in $\mathfrak{G}^*_{m_0}\}$. Hence $g(z_1,\dots,z_n)$ too can be continued single-valuedly and meromorphically from the boundary into the entire interior of $\mathfrak{G}_{m_0}(\eta'_1,\dots,\eta'_{n-1})$, and this continuation is nowhere branched there, since otherwise the same would have to occur on almost all approximating $\mathfrak{G}_{m_0}(\eta_1^{(\mu)},\dots,\eta_{n-1}^{(\mu)})$. This yields the desired contradiction.

Accordingly $g(z_1,\dots,z_n)$ can be continued single-valuedly and meromorphically from the boundary into the interior of all $\mathfrak{G}_{m_0}(\eta_1,\dots,\eta_{n-1})$ with $|\eta_\nu| \le \delta$. That this gives an unbranched, single-valued continuation of $g(z_1,\dots,z_n)$ in $\mathfrak{U}(\mathfrak{G})$ follows exactly as above in the case of regular functions. This proves the generalized continuity theorem for meromorphic functions.

It remains to show the existence of the function $\psi(z_n)$ needed above. This proof is contained in

Lemma C. Let $\mathfrak{G}$ be a domain in a non-closed Riemann surface $\mathfrak{R}$, and let $P_\nu$ be an at most countable sequence of points of $\mathfrak{G}$ without accumulation points in the interior of $\mathfrak{G}$. To each $P_\nu$ let there be assigned an expression

\[
  h_\nu(\tau_\nu) = \frac{a^{(\nu)}_{-k_\nu}}{\tau_\nu^{\,k_\nu}} + \dots + \frac{a^{(\nu)}_{-1}}{\tau_\nu} + a^{(\nu)}_0 + a^{(\nu)}_1 \tau_\nu + \dots + a^{(\nu)}_{l_\nu} \tau_\nu^{\,l_\nu},
  \qquad k_\nu \ge 0,\ l_\nu \ge 0;
\]

here $\tau_\nu$ denotes a local uniformizing parameter belonging to $P_\nu$. Then there is a function $\psi(z)$, single-valued and meromorphic in $\mathfrak{G}$, with the following properties:

(1) For $p(z) \neq P_\nu$, $\psi(z)$ is regular.

(2) For $p(z) = P_\nu$, the Laurent expansion of $\psi(z)$ in the local uniformizing parameter $\tau_\nu$ agrees with $h_\nu(\tau_\nu)$ up to the $l_\nu$-th term.

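Proof. In $\mathfrak{G}$ both Cousin statements hold (by Section II). There is a function $a(z)$, single-valued and regular in $\mathfrak{G}$, that has a zero of order $(l_\nu + 1)$ at each $P_\nu$. In $\mathfrak{U}(P_\nu)$ let, say,

\[
  a(z) = \beta^{(\nu)}_{l_\nu+1}\,\tau_\nu^{\,l_\nu+1} + \beta^{(\nu)}_{l_\nu+2}\,\tau_\nu^{\,l_\nu+2} + \dots, \qquad \beta^{(\nu)}_{l_\nu+1} \neq 0 .
\]

To each $P_\nu$ we assign an expression

\[
  h^*_\nu(\tau_\nu) = \frac{A^{(\nu)}_{-(k_\nu+l_\nu+1)}}{\tau_\nu^{\,k_\nu+l_\nu+1}} + \frac{A^{(\nu)}_{-(k_\nu+l_\nu)}}{\tau_\nu^{\,k_\nu+l_\nu}} + \dots + \frac{A^{(\nu)}_{-1}}{\tau_\nu},
\]

in which the coefficients $A^{(\nu)}_{-\rho}$, $\rho = 1,\dots,k_\nu + l_\nu + 1$, are fixed so that

\[
  \text{(1)}\qquad h^*_\nu(\tau_\nu)\cdot a(z)
  = \Bigl( \sum_{\rho=1}^{k_\nu+l_\nu+1} \frac{A^{(\nu)}_{-\rho}}{\tau_\nu^{\,\rho}} \Bigr)\cdot\Bigl( \sum_{\kappa=l_\nu+1}^{\infty} \beta^{(\nu)}_\kappa\,\tau_\nu^{\,\kappa} \Bigr)
  = \frac{a^{(\nu)}_{-k_\nu}}{\tau_\nu^{\,k_\nu}} + \dots + \frac{a^{(\nu)}_{-1}}{\tau_\nu} + a^{(\nu)}_0 + \dots + a^{(\nu)}_{l_\nu}\tau_\nu^{\,l_\nu} + \dots ;
\]

this determines the $A^{(\nu)}_{-\rho}$ uniquely. Because the first Cousin statement holds in $\mathfrak{G}$, there is a function $b(z)$, single-valued and meromorphic in $\mathfrak{G}$, for which at each $P_\nu$ the principal part of the Laurent expansion in the parameter $\tau_\nu$ agrees with $h^*_\nu(\tau_\nu)$, and which is regular at all points $p(z) \neq P_\nu$ of $\mathfrak{G}$. If we now set

\[
  \psi(z) = a(z)\cdot b(z),
\]

then by (1) $\psi(z)$ has the required property.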
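The way condition (1) fixes the coefficients of $h^*_\nu$ can be seen in the smallest nontrivial case. The worked example below is an illustration only: it assumes $k_\nu = 0$ and $l_\nu = 1$, and it drops the index $\nu$ to keep the formulas short.

```latex
% Worked instance of condition (1), assuming k_\nu = 0, l_\nu = 1 (index \nu dropped).
% Prescribed: h(\tau) = a_0 + a_1\tau.  Zero of a(z) of order l + 1 = 2:
\[
  a(z) = \beta_2\tau^2 + \beta_3\tau^3 + \dots, \quad \beta_2 \neq 0,
  \qquad
  h^*(\tau) = \frac{A_{-2}}{\tau^2} + \frac{A_{-1}}{\tau}.
\]
% Multiplying out:
\[
  h^*(\tau)\,a(z) = A_{-2}\beta_2 + \bigl(A_{-2}\beta_3 + A_{-1}\beta_2\bigr)\tau + \dots
\]
% Comparing with a_0 + a_1\tau gives a triangular system with the unique solution
\[
  A_{-2} = \frac{a_0}{\beta_2}, \qquad
  A_{-1} = \frac{a_1 - A_{-2}\,\beta_3}{\beta_2}
         = \frac{a_1\beta_2 - a_0\beta_3}{\beta_2^{\,2}} .
\]
% The regular part of b(z) at P contributes only terms of order \tau^2 and higher
% to a(z) b(z), so \psi = a b agrees with h(\tau) up to the term in \tau^1.
```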
Münster (Westf.)





AN EQUATION FOR THE DEGREES OF THE 
ABSOLUTELY IRREDUCIBLE REPRESENTATIONS 
OF A GROUP OF FINITE ORDER 


HANS ZASSENHAUS 


If there is a nonsingular symmetric bilinear form¹ $f(\sum a^i C_i, \sum b^k C_k) = \sum c_{ik} a^i b^k$
defined on a distributive algebra $A$ with basis elements $C_1, C_2, \dots, C_r$ over
a field $F$ such that $c_{ik} = c_{ki}$ $(i, k = 1, 2, \dots, r)$ and $(c_{ik})^{-1} = (c^{ik})$, then the
so-called Casimir operator [1,2]

\[
  C = \sum_i C^i C_i = \sum_{i,k} c^{ik} C_k C_i
\]

is independent of the choice of the basis of $A$ over $F$. Here $C^i$ is defined, as
usual in tensor calculus, by the formula $C^i = \sum_k c^{ik} C_k$. What can we say
concerning $C$ if $F$ is the rational number field, $A$ the class ring of a group $G$
of order $N$ which is embedded in the group ring $S$ of $G$ over $F$, $r$ the number
of classes of conjugate elements of $G$, $C_i$ the sum of the elements of the $i$th
class, $C_1$ the unity class, and finally $f(X, Y)$ is equal to $\operatorname{tr}(R(XY))/N^2$, where
$R$ denotes the regular representation of $S$?

In the multiplication table $C_iC_k = \sum c_{ik}^jC_j$ the non-negative integer $c_{ik}^j$ is equal to the number of representations of an arbitrarily chosen element of the $j$th class as the product of an element of the $i$th class and an element of the $k$th class; hence it can be easily derived from the multiplication table of $G$. In particular, $c_{ik}^1 = \delta_{ik'}h_i$ where the $k'$th class is the inverse of the $k$th class and $h_i$ is the number of elements in the $i$th class. Taking the elements of $G$ as a basis of $S$ in order to compute $R(C_i)$ we obtain the integral matrices $R(C_i) = (x_{i,A}^{B})$ where the row index $A$ and the column index $B$ run over $G$ and

$$x_{i,A}^{B} = \sum_{X \in C_i}\delta_{A,XB} = \begin{cases} 1, & \text{if } AB^{-1} \in C_i; \\ 0 & \text{otherwise.} \end{cases} \tag{1}$$

Denote the absolutely irreducible characters of $G$ by $\chi^1, \chi^2, \ldots, \chi^r$ such that $\chi^1 = 1$, and denote the value of the $k$th character at any element of the $i$th class by $\chi_i^k$, so that $\chi_1^k = f^k$ is the degree of the $k$th irreducible representation of $G$. Since $C_i$ is represented by a similarity transformation in any irreducible representation of $G$, we derive from these representations the $r$ representations $D^k(\sum a^iC_i) = (\sum a^ih_i\chi_i^k/f^k)$ of degree 1 of $A$, leading to the complete reduction

$$R_A \sim \sum_{m=1}^{r}(f^m)^2D^m$$

of the representation $R_A$ of $A$ induced by $R$, as is well known in the theory of representations. By application of the simple rule $\operatorname{tr}R(C_i) = N\delta_{i1}$, which follows from (1), we obtain


Received November 30, 1949. 
¹The symbol of summation without index of summation means summation over each suffix occurring as upper index as well as lower index.




$$c_{ik} = \operatorname{tr}R(C_iC_k)/N^2 = \sum c_{ik}^j\operatorname{tr}R(C_j)/N^2 = c_{ik}^1N/N^2 = h_i\delta_{ik'}/N.$$

Since $h_i$ is a divisor of $N$ it follows that

$$c^{ik} = \delta_{ik'}N/h_i, \tag{2}$$

$$x^{i,B}_{A} = \sum c^{ik}x_{k,A}^{B} \tag{3}$$

are integers. Also, the coefficients of the matrix

$$R(C) = \sum R(C_i)R(C^i) = \sum (x_{i,A}^{B})(x^{i,B}_{A}) = \Big(\sum\sum x_{i,A}^{D}x^{i,B}_{D}\Big), \tag{4}$$

and the coefficients of the characteristic polynomial of the matrix $R(C)$, i.e. $\det(tE_N - R(C))$, are all integers. On the other hand it follows from

$$R_A \sim \sum_{m=1}^{r}(f^m)^2D^m$$

and

$$D^m(C) = D^m\Big(\sum C^iC_i\Big) = D^m\Big(\sum C_ic^{ik}C_k\Big) = \sum \frac{N}{h_i}D^m(C_i)D^m(C_{i'}) = \Big(\sum \frac{N}{h_i}\cdot\frac{h_i\chi_i^m}{f^m}\cdot\frac{h_{i'}\chi_{i'}^m}{f^m}\Big) = \frac{N}{(f^m)^2}\sum h_i\chi_i^m\chi_{i'}^m = (N/f^m)^2$$

that the matrix $R(C)$ is equivalent to the diagonal matrix with the numbers $(N/f^m)^2$, each $(f^m)^2$ times, in the diagonal. Hence we have the formula

$$\det(tE_N - R(C)) = \prod_{m=1}^{r}\big(t - (N/f^m)^2\big)^{(f^m)^2}, \tag{5}$$

which means that the rational numbers $N/f^m$ can be computed by solving an equation explicitly known from (1)-(4) and the multiplication table of $G$. Since the coefficients of that equation are rational integers and the highest coefficient is 1, it follows that $f^m$ is a divisor of $N$.
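The construction can be tried on the smallest non-abelian case. The following is a minimal sketch, not part of the paper, that takes $G = S_3$ (so $N = 6$ and the degrees are 1, 1, 2) and builds the matrix of left multiplication by $C = \sum (N/h_i)C_iC_{i'}$ on the group basis. The function names are illustrative assumptions. Formula (5) predicts the eigenvalue $36$ twice and $9$ four times, so the trace should be $108$ and $(R(C) - 36E)(R(C) - 9E)$ should vanish.

```python
from itertools import permutations


def compose(p, q):
    """Product p*q of two permutations given as tuples (apply q first, then p)."""
    return tuple(p[q[i]] for i in range(len(q)))


def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * len(p)
    for i, v in enumerate(p):
        inv[v] = i
    return tuple(inv)


def conjugacy_classes(group):
    """Split the group (a list of permutations) into its conjugacy classes."""
    classes, seen = [], set()
    for g in group:
        if g not in seen:
            cls = {compose(compose(x, g), inverse(x)) for x in group}
            classes.append(sorted(cls))
            seen |= cls
    return classes


def casimir_matrix(group):
    """Matrix of left multiplication by C = sum_i (N/h_i) C_i C_i' on the group basis."""
    n = len(group)
    index = {g: k for k, g in enumerate(group)}
    matrix = [[0] * n for _ in range(n)]
    for cls in conjugacy_classes(group):
        weight = n // len(cls)              # N / h_i, an integer since h_i divides N
        for a in cls:
            for b in cls:
                c = compose(a, inverse(b))  # one term of C_i * C_i' (b^-1 runs over the inverse class)
                for g in group:
                    matrix[index[compose(c, g)]][index[g]] += weight
    return matrix


def matmul(a, b):
    """Plain integer matrix product."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def shifted(matrix, t):
    """Return matrix - t*E."""
    return [[v - (t if i == j else 0) for j, v in enumerate(row)] for i, row in enumerate(matrix)]


S3 = list(permutations(range(3)))
R_C = casimir_matrix(S3)
trace = sum(R_C[i][i] for i in range(len(S3)))
annihilated = matmul(shifted(R_C, 36), shifted(R_C, 9))
```

Since the two candidate eigenvalues are distinct, the vanishing product shows that $R(C)$ is diagonalizable with eigenvalues in $\{36, 9\}$, and the trace then fixes the multiplicities.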


REFERENCES 


[1] H. Casimir and B. L. van der Waerden, "Algebraischer Beweis der vollständigen Reduzibilität der Darstellungen halbeinfacher Liescher Gruppen," Math. Ann., vol. 111 (1935), 1-12.

[2] J. H. C. Whitehead, "Certain equations in the algebra of a semi-simple infinitesimal group," Quart. J. Math., vol. 8 (1937), 220-237.


McGill University 








IDENTITIES AND CONGRUENCES OF THE 
RAMANUJAN TYPE 


K. G. RAMANATHAN 


1. Let $P(n)$ denote the number of unrestricted partitions of the positive integer $n$. Ramanujan¹ conjectured that

$$P(n) \equiv 0 \pmod{5^\alpha 7^\beta 11^\gamma} \tag{1.1}$$

if $24n \equiv 1 \pmod{5^\alpha 7^\beta 11^\gamma}$. He also indicated that such congruences could be deduced from identities of the type

$$P(4) + P(9)x + \ldots = 5\,\frac{[(1 - x^5)(1 - x^{10})\ldots]^5}{[(1 - x)(1 - x^2)\ldots]^6}. \tag{1.2}$$


G. N. Watson² proved that (1.1) is true for all $\alpha$ if $\beta = \gamma = 0$ and that it is false for $\beta > 2$ if $\gamma = 0$. However he established an alternative congruence for all powers of 7. It is not known if (1.1) is true for all $\gamma$. Watson, in the same paper, showed that identities of the type (1.2) exist for all powers of 5 and 7. Recently Rademacher³ developed a method which enabled him to establish identities of the type (1.2) for all powers of 5, 7, and 13. He did not, however, use them to prove the congruences of the Watson-Ramanujan type.

In this paper we make use of a combination of the methods of Watson and Rademacher to establish identities and congruences analogous to (1.2) and (1.1) for the function

$$\sum_{n=0}^{\infty}P_\nu(n)x^n = [(1 - x)(1 - x^2)\ldots]^{-\nu}, \qquad \nu > 0. \tag{1.3}$$

We prove that the coefficients $P_\nu(n)$ in (1.3) also satisfy identities of the type (1.2) for all powers of 5 and 7. The congruence properties of $P_\nu(n)$ are contained in the following two theorems:


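THEOREM 1. If $24n \equiv \nu \pmod{5^\alpha}$ then

$$P_\nu(n) \equiv 0 \left(\bmod \begin{cases} 5 \\ 5^{[(\alpha-1)/2]} \\ 5^{[\alpha/2]} \\ 5^{[(\alpha+1)/2]} \\ 5^{[(\alpha+2)/2]} \\ 5^{\alpha-1} \\ 5^{\alpha} \end{cases}\right) \quad\text{if}\quad \nu \equiv \begin{cases} 12, 17, 22, 27 \\ 15, 20, 25 \\ 3, 4, 8, 9 \\ 16, 21, 26 \\ 2, 7 \\ 0, 5, 10 \\ 1, 6, 11 \end{cases} \pmod{30}, \tag{1.4}$$

$[x]$ being the largest integer in $x$.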


Received March 21, 1949. The author would like to thank Dr. Paul T. Bateman for his 
suggestions for improvement. 
¹Ramanujan, S., Collected papers, Cambridge, 1927.




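THEOREM 2. If $24n \equiv \nu \pmod{7^\alpha}$ then

$$P_\nu(n) \equiv 0 \left(\bmod \begin{cases} 7 \\ 7^{[(\alpha-1)/2]} \\ 7^{[\alpha/2]} \\ 7^{[(\alpha+1)/2]} \\ 7^{[(\alpha+2)/2]} \\ 7^{\alpha-1} \\ 7^{\alpha} \end{cases}\right) \quad\text{if}\quad \nu \equiv \begin{cases} 8, 15, 22 \\ 14, 21 \\ 2, 3, 5, 6 \\ 11, 18, 25 \\ 1 \\ 0, 7 \\ 4 \end{cases} \pmod{28}. \tag{1.5}$$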








It may be noted that the Watson-Ramanujan congruences are included in Theorems 1 and 2. We may remark that by proving an analogue of Lemma 4 below for $p = 13$ we can deduce congruences of $P_\nu(n)$ for the modulus $13^\alpha$. It will be seen that the proofs of Theorems 1 and 2 are simple and straightforward.


2. Throughout this paper $p = 5$ or $7$, and by an integer we mean a rational integer.

Let⁴ $\Gamma_0(p)$ be the group of unimodular substitutions

$$\tau' = \frac{a\tau + b}{c\tau + d} \tag{2.1}$$

where $a, b, c, d$ are integers such that $ad - bc = 1$ and $c \equiv 0 \pmod p$. It is known that $\Gamma_0(p)$ is a subgroup of index $p + 1$ in the whole modular group and that its fundamental domain contains two parabolic points $\tau = 0$ and $\tau = i\infty$. All modular functions of $\Gamma_0(p)$, that is meromorphic functions invariant under the substitutions of $\Gamma_0(p)$, are rational functions of

$$\phi(\tau) = \left(\frac{\eta(p\tau)}{\eta(\tau)}\right)^{g} \tag{2.2}$$

where $g = 24/(p - 1)$ and $\eta(\tau)$ is Dedekind's modular form

$$\eta(\tau) = e^{\pi i\tau/12}\prod_{n=1}^{\infty}(1 - e^{2\pi in\tau}), \qquad \tau = x + iy,\ y > 0. \tag{2.3}$$

The function $\phi(\tau)$,⁵ which is called the Hauptmodul of $\Gamma_0(p)$, has the following properties:

(i) It is single valued in the fundamental domain.

(ii) It is invariant under all transformations of $\Gamma_0(p)$.


²Watson, G. N., Ramanujans Vermutung über Zerfällungsanzahlen, Jour. für Math., Bd. 179 (1938), p. 97.

³Rademacher, H., The Ramanujan identities under modular substitutions, Trans. Amer. Math. Soc., vol. 51 (1942), p. 609.

⁴All these are contained in Klein and Fricke, Vorlesungen über die Theorie der Elliptischen Modulfunktionen, Bd. 2, p. 64. Also see: L. J. Mordell, Note on certain modular relations considered by Messrs. Ramanujan, Darling, and Rogers, Proc. Lond. Math. Soc. (2), 20 (1922), p. 408.

⁵$\phi(\tau)$ depends on $p$. We omit this suffix $p$ in general, but when explicit reference has to be made we write $\phi_p(\tau)$.













(iii) At $\tau = i\infty$ it has a zero of the first order and the Fourier expansion there is

$$\phi(\tau) = e^{2\pi i\tau} + \ldots$$

with integral coefficients.

(iv) It is regular at all points of $y > 0$ except at $\tau = 0$, where it has a pole of first order with residue $p^{-g/2}$ measured in terms of the uniformizing parameter $e^{-2\pi i/(p\tau)}$.

Let us call a modular function of $\Gamma_0(p)$ entire modular if it is regular at all points of $y > 0$, except at $\tau = 0$ where it has a polar singularity; then we can prove easily that every entire modular function of $\Gamma_0(p)$ is a polynomial in $\phi(\tau)$ with rational coefficients. We can prove even more, as is shown by

LEMMA 1. If $f(\tau)$ is an entire modular function of $\Gamma_0(p)$ whose Fourier expansion at $\tau = i\infty$ has coefficients belonging to a module $\mathfrak{m}$, then $f(\tau)$ is a polynomial in $\phi(\tau)$ with coefficients belonging to the module.

Proof. That $f(\tau)$ is a polynomial in $\phi(\tau)$ is obvious. Let

$$f(\tau) = \sum_{j=0}^{n}a_j\phi(\tau)^j; \tag{2.4}$$

since $\phi(\tau)$ at $\tau = i\infty$ has its first coefficient unity, we see on equating coefficients of the powers of the uniformizing parameter $e^{2\pi i\tau}$ in the expansion at $\tau = i\infty$ that

$$a_0,\ a_1,\ a_2 + l_1a_1,\ a_3 + l_2a_2 + l_3a_1,\ \ldots,\ a_n + l_pa_{n-1} + \ldots + l_qa_1 \tag{2.5}$$

are all in $\mathfrak{m}$ ($l_1, \ldots, l_q$ being integers). From the definition of a module we deduce that $a_1, \ldots, a_n$ are in $\mathfrak{m}$.

COROLLARY. If $\mathfrak{m}$ is the module of integers then $a_1, \ldots, a_n$ are integers.


3. We shall prove certain lemmas preliminary to the proof of the theorems. 


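LEMMA 2. Let $q$ be a prime $> 3$ and $\nu, \mu$ two non-negative integers. Then

$$h(\tau) = \left(\frac{\eta(q\tau)}{\eta(\tau)}\right)^{\mu}\sum_{\lambda=0}^{q-1}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{q}\big)}\right)^{\nu}, \tag{3.1}$$

where

$$(\nu - \mu)(q - 1) \equiv 0 \pmod{24}, \tag{3.2}$$

has the transformation equation

$$h(\tau') = \left(\frac{a}{q}\right)^{\nu-\mu}h(\tau), \tag{3.3}$$

where

$$\tau' = \frac{a\tau + b}{c\tau + d}, \qquad ad - bc = 1, \qquad c \equiv 0 \pmod q, \tag{3.4}$$

and $(a/q)$ is the Legendre symbol.

This lemma includes, as particular cases, Theorems 1, 2 and 3 of Rademacher³ and can be proved by using Lemmas 4 and 5 of his paper.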







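LEMMA 3. If $\nu$ is any non-negative integer and $\mu$ the least non-negative residue of $\nu$ mod $g$, then

$$\psi_\nu(\tau) = \frac{1}{p}\left(\frac{\eta(p\tau)}{\eta(\tau)}\right)^{\mu}\sum_{\lambda=0}^{p-1}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{\nu} \tag{3.5}$$

is entire modular in $\Gamma_0(p)$ and is a polynomial in $\phi(\tau)$ with integral coefficients.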








Proof. Since $p = 5$ or $7$, $g$ is even. If we put $q = p$ and $\nu \equiv \mu \pmod g$ in Lemma 2 we find that $\psi_\nu(\tau)$ is invariant under all substitutions of $\Gamma_0(p)$. $\psi_\nu(\tau)$ is evidently regular at $\tau = i\infty$. To investigate the behaviour at $\tau = 0$ we make the transformation (following Rademacher³) $\tau \to -1/\tau$. We have

$$\psi_\nu(-1/\tau) = \frac{1}{p}\left(\frac{\eta(-1/\tau)}{\eta(-1/\tau p)}\right)^{\nu}\left(\frac{\eta(-p/\tau)}{\eta(-1/\tau)}\right)^{\mu} + \frac{1}{p}\sum_{\lambda=1}^{p-1}\left(\frac{\eta(-p/\tau)}{\eta(-1/\tau)}\right)^{\mu}\left(\frac{\eta(-1/\tau)}{\eta\big(\frac{-1/\tau + 24\lambda}{p}\big)}\right)^{\nu}. \tag{3.6}$$

We can prove that the quantity $\sum_{\lambda=1}^{p-1}$ is regular at $(-1/\tau) = 0$. The first term on the right of (3.6) can be transformed, by making use of the functional equation

$$\eta(-1/\tau) = (-i\tau)^{1/2}\eta(\tau), \tag{3.7}$$

into

$$p^{-(\nu+\mu+2)/2}\,\eta(\tau)^{\nu-\mu}\,\eta(\tau/p)^{\mu}\,\eta(p\tau)^{-\nu}. \tag{3.8}$$

Using (2.3) we see that the order of the pole at $-1/\tau = 0$, measured in terms of $e^{2\pi i\tau/p}$, is

$$\frac{\nu(p^2 - 1) - (\nu - \mu)(p - 1)}{24}.$$

Hence $\psi_\nu(\tau)$ is entire modular in $\Gamma_0(p)$. It is easily seen that the Fourier expansion of $\psi_\nu(\tau)$ at $\tau = i\infty$ has all its coefficients integral, so that from Lemma 1 it follows that $\psi_\nu(\tau)$ is a polynomial in $\phi(\tau)$ with integral coefficients.

It may be noticed that the $S_\nu$ of Watson⁶ and the $\psi_\nu(\tau)$ above are connected by the relation

$$\psi_\nu(\tau) = S_\nu\left(\frac{\eta(p\tau)}{\eta(\tau)}\right)^{\mu}, \tag{3.9}$$

where $\mu$ is defined by Lemma 3. To calculate $\psi_\nu(\tau)$ we require only expressions for $\psi_1(\tau), \ldots, \psi_{p+1}(\tau)$ as polynomials⁷ in $\phi(\tau)$. We obtain from Watson's table of $S_\nu$ the
LEMMA 4.

$$\psi_\nu(\tau) = \sum_{i>0}a_ip^{i-1}\phi(\tau)^i, \tag{3.10}^8$$

where $a_i$ are integers vanishing for a sufficiently large $i$. In fact $a_i = 0$ if $24i > \nu(p^2 - 1) - (\nu - \mu)(p - 1)$.


⁶Watson, loc. cit., pp. 106, 119, 120.

⁷There is another method of obtaining expressions for $\psi_\nu(\tau)$ as polynomials in $\phi(\tau)$. This will be published elsewhere.

⁸$\sum_{i>0}$ shall mean summation from $i = 1$ to a sufficiently large $i$.













It is evident that if we expand both sides of (3.10) at $\tau = 0$ and equate corresponding coefficients of the powers of the parameter $e^{2\pi i\tau/p}$, we find that the coefficients of the powers of $\phi(\tau)$ in (3.10) are of the above form if $i(g - 2) \geq (\nu + \mu)$. It is only for obtaining this property for all $i$ that we require a table of $\psi_1(\tau), \ldots, \psi_{p+1}(\tau)$ as polynomials in $\phi(\tau)$. It does not appear to be possible to obtain Lemma 4 without the said table.

It must be noticed that the $a_i$ are integers depending on $\nu$ and $p$.








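LEMMA 5.

$$f_\nu(\tau) = \frac{1}{p}\sum_{\lambda=0}^{p-1}\left(\frac{\eta(p\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{\nu} \tag{3.11}$$

is entire modular in $\Gamma_0(p)$ and

$$f_\nu(\tau) = \sum_{i>0}a_ip^{i-1}\phi(\tau)^{i + (\nu-\mu)/g}, \tag{3.12}$$

where $\mu$, $a_i$ are defined in Lemmas 3 and 4.

Proof. This follows from Lemma 3 on using the fact that

$$f_\nu(\tau) = \psi_\nu(\tau)\left(\frac{\eta(p\tau)}{\eta(\tau)}\right)^{\nu-\mu}. \tag{3.13}$$

Note that $\nu - \mu$ is divisible by $g$.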
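LEMMA 6. Let $k = (\nu - \mu)/g$, $t = \frac{1}{2}(1 - (-1)^n)$ and $\epsilon = p$ or $1$ according as $n$ is odd or even; then⁹

$$F_{n,\nu}(\tau) = \frac{1}{p^n}\sum_{\lambda=0}^{p^n-1}\left(\frac{\eta(\epsilon\tau)}{\eta\big(\frac{\tau + 24\lambda}{p^n}\big)}\right)^{\nu} \tag{3.14}$$

is entire modular in $\Gamma_0(p)$ and has the form¹⁰

$$F_{n,\nu}(\tau) = \phi(\tau)^{kt}\sum_{i>0}a_i(n)p^{i-1}\phi(\tau)^i, \tag{3.15}$$

the $a_i(n)$ being integers depending on $\nu$ and $p$.

Proof. When $n = 1$,

$$F_{1,\nu}(\tau) = \psi_\nu(\tau)\phi(\tau)^k \tag{3.16}$$

and Lemma 5 shows that (3.14) and (3.15) are true. We can write

$$F_{1,\nu}(\tau) = \sum_{i>0}a_i(1)p^{i-1}\phi(\tau)^{i+k}. \tag{3.16'}$$

Now

$$F_{2,\nu}(\tau) = \frac{1}{p}\sum_{\lambda=0}^{p-1}F_{1,\nu}\Big(\frac{\tau + 24\lambda}{p}\Big) = \sum_{i>0}a_i(1)p^{i-1}\cdot\frac{1}{p}\sum_{\lambda=0}^{p-1}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{g(i+k)}. \tag{3.17}$$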

⁹The construction of the function $F_{n,\nu}(\tau)$ is suggested by the work of Rademacher, pp. 622 and 624.

¹⁰To avoid complication we have not shown in $a_i(n)$ its dependence on $\nu$ and $p$.







Applying Lemmas 3 and 4 to the inner sum, we get

$$F_{2,\nu}(\tau) = \sum_{i>0}a_i(1)p^{i-1}\sum_{j>0}b_{ij}p^{j-1}\phi(\tau)^j = \sum_{j>0}a_j(2)p^{j-1}\phi(\tau)^j, \tag{3.18}$$

where

$$a_j(2) = \sum_{i>0}a_i(1)b_{ij}p^{i-1}$$

are integers. This proves that Lemma 6 is true for $n = 2$. We can now apply induction.


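Let us assume that for a certain $n = 2m$

$$F_{2m,\nu}(\tau) = \frac{1}{p^{2m}}\sum_{\lambda=0}^{p^{2m}-1}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{p^{2m}}\big)}\right)^{\nu} = \sum_{i>0}a_i(2m)p^{i-1}\phi(\tau)^i, \tag{3.19}$$

where $a_i(2m)$ are integers. Then

$$F_{2m+1,\nu}(\tau) = \frac{1}{p}\sum_{\lambda=0}^{p-1}F_{2m,\nu}\Big(\frac{\tau + 24\lambda}{p}\Big)\left(\frac{\eta(p\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{\nu} = \sum_{i>0}a_i(2m)p^{i-1}\cdot\frac{1}{p}\sum_{\lambda=0}^{p-1}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{gi}\left(\frac{\eta(p\tau)}{\eta\big(\frac{\tau + 24\lambda}{p}\big)}\right)^{\nu},$$

that is

$$F_{2m+1,\nu}(\tau) = \phi(\tau)^k\sum_{i>0}a_i(2m)p^{i-1}\psi_{gi+\nu}(\tau), \tag{3.20}$$

which shows that from the truth of Lemma 6 for $n = 2m$ we can deduce it for $n = 2m + 1$. In a similar way, if we assume for $n = 2m + 1$

$$F_{2m+1,\nu}(\tau) = \sum_{i>0}a_i(2m + 1)p^{i-1}\phi(\tau)^{i+k}, \tag{3.21}$$

then we can deduce that

$$F_{2m+2,\nu}(\tau) = \sum_{i>0}a_i(2m + 1)p^{i-1}\psi_{gi+\nu-\mu}(\tau). \tag{3.22}$$

The induction may now be completed easily.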
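LEMMA 7.

$$F_{n,\nu}(\tau) = x^{\delta}\prod_{m=1}^{\infty}(1 - x^{\epsilon m})^{\nu}\sum_{l=0}^{\infty}P_\nu(p^nl + \rho)x^l, \tag{3.23}$$

where

$$x = e^{2\pi i\tau}, \tag{3.24}$$

$$\rho \text{ is the least positive solution of } 24\rho \equiv \nu \pmod{p^n}, \tag{3.25}$$

$$\delta = \frac{24\rho - \nu + \epsilon\nu p^n}{24p^n} \tag{3.26}$$

is an integer and $\epsilon = 1$ or $p$ according as $n$ is even or odd.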













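Proof. Using (2.3) we get

$$F_{n,\nu}(\tau) = e^{2\pi i\tau\nu(\epsilon p^n - 1)/24p^n}\prod_{m=1}^{\infty}(1 - e^{2\pi i\epsilon m\tau})^{\nu}\sum_{r=0}^{\infty}P_\nu(r)A_{r,n}(\nu)e^{2\pi i\tau r/p^n}, \tag{3.27}$$

where

$$A_{r,n}(\nu) = \frac{1}{p^n}\sum_{\lambda=0}^{p^n-1}\exp\Big(2\pi i\lambda\,\frac{24r - \nu}{p^n}\Big). \tag{3.28}$$

$A_{r,n}(\nu)$ vanishes unless $24r - \nu \equiv 0 \pmod{p^n}$, when its value is unity.

It can be seen that $\delta$ is an integer. For from the definition of $\rho$, $24\rho - \nu + \nu\epsilon p^n$ is divisible by $p^n$; furthermore $\epsilon p^n - 1$ is divisible by $p^2 - 1$ and hence by 24.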
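As an illustration of (3.25) and (3.26), the quantities $\rho$ and $\delta$ can be computed directly. The sketch below is not from the paper; the function name `rho_delta` is an assumption, and it takes $\epsilon = p$ for odd $n$ and $\epsilon = 1$ for even $n$, as in Lemma 7. It uses the three-argument `pow` with exponent `-1` (Python 3.8 or later) for the inverse of 24 modulo $p^n$.

```python
def rho_delta(nu, p, n):
    """Return (rho, delta, epsilon) of Lemma 7 for p = 5 or 7."""
    modulus = p ** n
    epsilon = p if n % 2 == 1 else 1             # epsilon = p for odd n, 1 for even n
    rho = (nu * pow(24, -1, modulus)) % modulus  # solves 24*rho = nu (mod p^n)
    if rho == 0:
        rho = modulus                            # take the least *positive* solution
    numerator = 24 * rho - nu + epsilon * nu * modulus
    # delta is an integer, as argued in the text
    assert numerator % (24 * modulus) == 0
    return rho, numerator // (24 * modulus), epsilon
```

For $\nu = 3$, $p = 5$, $n = 1$ this gives $\rho = 2$, $\delta = 1$, $\epsilon = 5$, the values used in the example of section 4.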
LEMMA 8. If $c$ is any positive integer and all the $a_i(n)$ in (3.15), $i > 0$, are divisible by $p^c$, then $P_\nu(p^nl + \rho)$ in (3.23) for all $l \geq 0$ are divisible by $p^c$.

Proof. This follows immediately on writing down

$$F_{n,\nu}(\tau) = \phi(\tau)^{kt}\sum_{i>0}a_i(n)p^{i-1}\phi(\tau)^i = x^{\delta}\sum_{m=0}^{\infty}b_mx^m\sum_{l=0}^{\infty}P_\nu(p^nl + \rho)x^l, \tag{3.29}$$

where $b_0 = 1$ and $b_m$ are integers, and using the fact that $\phi(\tau)$ at $\tau = i\infty$ has a Fourier expansion in $x = e^{2\pi i\tau}$ beginning with $x$ with the coefficient unity. We have merely to compare coefficients on both sides.

Lemma 8 shows us that for investigating the congruence properties of $P_\nu(p^nl + \rho)$ it is enough if we confine ourselves to the $a_i(n)$. We must only remember that if all the $a_i(n)$, $i > 0$, for a given $n$, $\nu$ are divisible by $p^c$, then $P_\nu(p^nl + \rho)$ for all $l \geq 0$ are divisible by $p^c$.


4. Lemmas 6 and 7 establish the existence of identities of the Ramanujan type for the function $P_\nu(n)$. For from (3.15) and (3.23) we get

$$x^{\delta}\prod_{m=1}^{\infty}(1 - x^{\epsilon m})^{\nu}\sum_{l=0}^{\infty}P_\nu(p^nl + \rho)x^l = \phi(\tau)^{kt}\sum_{i>0}a_i(n)p^{i-1}\phi(\tau)^i = \frac{x^{kt}\prod(1 - x^{pm})^{ktg}}{\prod(1 - x^m)^{ktg}}\sum_{i>0}a_i(n)p^{i-1}x^i\,\frac{\prod(1 - x^{pm})^{ig}}{\prod(1 - x^m)^{ig}}, \tag{4.1}$$

where $a_i(n)$ are integers.

As an example we take

$$\sum_{n=0}^{\infty}P_3(n)x^n = [1 - 3x + 5x^3 - 7x^6 + \ldots]^{-1}. \tag{4.2}$$

Here $\nu = 3$. Let $p = 5$ so that $g = 6$, $\epsilon = 5$, $n = 1$, $\rho = 2$, $t = 1$, $\delta = 1$ and $k = 0$. Then

$$P_3(2) + P_3(7)x + \ldots = 9\,\frac{\prod(1 - x^{5m})^3}{\prod(1 - x^m)^6} + 3\cdot5^3x\,\frac{\prod(1 - x^{5m})^9}{\prod(1 - x^m)^{12}} + 5^5x^2\,\frac{\prod(1 - x^{5m})^{15}}{\prod(1 - x^m)^{18}}. \tag{4.3}$$

Similarly an identity for the modulus 7 may be derived.
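The example can be checked numerically by expanding both sides as power series. The sketch below is not from the paper. It assumes that (4.3) reads $\sum_l P_3(5l + 2)x^l = \sum_{i=1}^{3}c_ix^{i-1}\prod(1 - x^{5m})^{6i-3}/\prod(1 - x^m)^{6i}$ with $(c_1, c_2, c_3) = (9, 3\cdot5^3, 5^5)$, and the helper names are illustrative.

```python
def eta_product(step, exponent, n_terms):
    """Coefficients of prod_{m>=1} (1 - x^(step*m))^exponent up to x^(n_terms-1)."""
    coeffs = [1] + [0] * (n_terms - 1)
    for s in range(step, n_terms, step):
        for _ in range(abs(exponent)):
            if exponent > 0:
                # multiply in place by (1 - x^s), highest power first
                for k in range(n_terms - 1, s - 1, -1):
                    coeffs[k] -= coeffs[k - s]
            else:
                # divide in place by (1 - x^s), lowest power first
                for k in range(s, n_terms):
                    coeffs[k] += coeffs[k - s]
    return coeffs


def series_mul(a, b, n_terms):
    """Product of two power series, truncated after n_terms coefficients."""
    out = [0] * n_terms
    for i, ai in enumerate(a[:n_terms]):
        if ai:
            for j, bj in enumerate(b[:n_terms - i]):
                out[i + j] += ai * bj
    return out


def identity_sides(n_terms):
    """First n_terms coefficients of the left and right sides of the assumed (4.3)."""
    p3 = eta_product(1, -3, 5 * n_terms + 3)       # P_3(0), P_3(1), ...
    lhs = [p3[5 * l + 2] for l in range(n_terms)]  # P_3(2), P_3(7), ...
    rhs = [0] * n_terms
    for i, coeff in enumerate((9, 3 * 5**3, 5**5), start=1):
        term = series_mul(eta_product(5, 6 * i - 3, n_terms),
                          eta_product(1, -6 * i, n_terms), n_terms)
        for k in range(n_terms - (i - 1)):
            rhs[k + i - 1] += coeff * term[k]      # shift by x^(i-1)
    return lhs, rhs
```

Calling `identity_sides(30)` compares the first thirty coefficients; a finite comparison of this kind is only a sanity check, not a proof of the identity.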




5. We shall now prove congruence properties of $P_\nu(n)$. Because of Lemma 8 it is enough if we find congruence properties of $a_i(n)$.

Let now $\nu$ be a fixed integer. Let $p^{\lambda_n}$, for a given $n$, be the highest power of $p$ which divides all $a_i(n)$ for $i > 0$. We shall study now the sequence $\lambda_1, \lambda_2, \ldots$. Consider (3.20):

$$F_{2m+1,\nu}(\tau) = \phi(\tau)^k\sum_{i>0}a_i(2m)p^{i-1}\psi_{gi+\nu}(\tau) = \phi(\tau)^k\sum_{j>0}a_j(2m + 1)p^{j-1}\phi(\tau)^j. \tag{5.1}$$

Let

$$\psi_{gi+\nu}(\tau) = \sum_{j>0}b_{ij}(i)p^{j-1}\phi(\tau)^j. \tag{5.1'}$$

Substituting in (5.1) we get

$$a_j(2m + 1) = \sum_{i>0}b_{ij}(i)p^{i-1}a_i(2m). \tag{5.2}$$

In a similar way from (3.22), if

$$\psi_{gi+\nu-\mu}(\tau) = \sum_{j>0}b'_{ij}(i)p^{j-1}\phi(\tau)^j, \tag{5.1''}$$

then

$$a_j(2m + 2) = \sum_{i>0}a_i(2m + 1)b'_{ij}(i)p^{i-1}. \tag{5.3}$$

All the quantities involved in (5.2) and (5.3) are integers and hence $p^{\lambda_{2m}}$ divides all $a_j(2m + 1)$ and $p^{\lambda_{2m+1}}$ divides all $a_j(2m + 2)$, so that

$$\lambda_{2m+2} \geq \lambda_{2m+1} \geq \lambda_{2m}.$$

Hence

$$\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \ldots. \tag{5.4}$$

Now (5.2) may be written

$$a_j(2m + 1) = b_{1j}(1)a_1(2m) + p\sum_{i>1}b_{ij}(i)p^{i-2}a_i(2m). \tag{5.5}$$

The second term on the right is divisible by $p\cdot p^{\lambda_{2m}} = p^{1+\lambda_{2m}}$. The first term $b_{1j}(1)a_1(2m)$ is in general only divisible by $p^{\lambda_{2m}}$. We cannot say anything regarding $a_1(2m)$. But if $b_{1j}(1)$ is divisible by $p$ for all $j > 0$ then $a_j(2m + 1)$ is divisible by $p^{1+\lambda_{2m}}$ for all $j > 0$. That is,

$$\lambda_{2m+1} \geq 1 + \lambda_{2m}.$$

From (5.1'), $b_{1j}(1)$ are the coefficients in $\psi_{g+\nu}(\tau)$, considered as a polynomial in $\phi(\tau)$. Thus

(5.6) If all the coefficients $b_{1j}(1)$, $j > 0$, in
$$\psi_{g+\nu}(\tau) = \sum_{j>0}b_{1j}(1)p^{j-1}\phi(\tau)^j$$
are divisible by $p$, then $\lambda_{2m+1} \geq 1 + \lambda_{2m}$.

In a similar way we derive from (5.3) and (5.1'') that:

(5.7) If all the coefficients $b'_{1j}(1)$, $j > 0$, in
$$\psi_{g+\nu-\mu}(\tau) = \sum_{j>0}b'_{1j}(1)p^{j-1}\phi(\tau)^j$$
are divisible by $p$, then $\lambda_{2m+2} \geq 1 + \lambda_{2m+1}$.













It must be noticed that the conditions (5.6) and (5.7) are only sufficient; they are by no means necessary. It can be seen also that the conditions (5.6) and (5.7) depend only on the parity, and not on the actual value, of $n$.

The conditions (5.6) and (5.7) enable us to divide the sequence (5.4) into four categories.

(i) $\lambda_1 < \lambda_2 < \lambda_3 < \ldots$ ((5.6) and (5.7) both hold good).
(ii) $\lambda_1 \leq \lambda_2 < \lambda_3 \leq \ldots$ ((5.6) but not (5.7) holds).
(iii) $\lambda_1 < \lambda_2 \leq \lambda_3 < \ldots$ ((5.7) but not (5.6) holds).
(iv) $\lambda_1 \leq \lambda_2 \leq \lambda_3 \leq \ldots$ ((5.6) and (5.7) don't hold).

We shall examine these cases and see what consequences they lead to with regard to congruence properties of the $a_i(n)$.
Case (i). Since (5.6) and (5.7) hold good,

$$\lambda_n \geq \lambda_1 + (n - 1) \tag{5.8}$$

and $p \mid b_{1j}$ and $p \mid b'_{1j}$, $j > 0$.

Let $p = 5$; then $g = 6$ and Watson's table⁶ shows that if

$$s \equiv 1, 2 \pmod 5, \tag{5.9}$$

then $a_{sj}(1) \equiv 0 \pmod 5$, where $\psi_s(\tau) = \sum_{j>0}a_{sj}(1)5^{j-1}\phi_5(\tau)^j$.

Hence for our conditions (5.8) we obtain

$$\nu \equiv 0, 1 \pmod 5, \qquad \nu - \mu \equiv 0, 1 \pmod 5, \qquad 0 \leq \mu < 6, \qquad \nu \equiv \mu \pmod 6. \tag{5.10}$$

The $\nu$ satisfying these conditions are $\nu \equiv 0, 1, 5, 6, 10, 11 \pmod{30}$. Using (5.9) we see that $\lambda_1 \geq 1$ if $\nu \equiv 1, 2 \pmod 5$ and therefore if $\nu \equiv 1, 6, 11 \pmod{30}$; otherwise $\lambda_1 \geq 0$. Now $a_i(n)$, $i > 0$, are divisible by $p^{\lambda_n}$ $(= 5^{\lambda_n})$ and hence by (5.8) they are divisible by $5^{n-1+\lambda_1}$. We may summarize these in the conclusion, for $i > 0$:

$$a_i(n) \equiv 0 \pmod{5^n}, \qquad \nu \equiv 1, 6, 11 \pmod{30}; \tag{5.11'}$$

$$a_i(n) \equiv 0 \pmod{5^{n-1}}, \qquad \nu \equiv 0, 5, 10 \pmod{30}. \tag{5.11''}$$

In exactly a similar way, if $p = 7$, then $g = 4$ and Watson's table shows that if

$$r \equiv 1, 4 \pmod 7, \tag{5.12}$$

then $a'_{rj}(1) \equiv 0 \pmod 7$, where $\psi_r(\tau) = \sum_{j>0}a'_{rj}(1)7^{j-1}\phi_7(\tau)^j$.

Hence, for $i > 0$:

$$a_i(n) \equiv 0 \pmod{7^n}, \qquad \nu \equiv 4 \pmod{28}; \tag{5.13'}$$

$$a_i(n) \equiv 0 \pmod{7^{n-1}}, \qquad \nu \equiv 0, 7 \pmod{28}. \tag{5.13''}$$


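Case (ii). (5.6) holds but not (5.7). Hence

$$\lambda_{2n+1} \geq n + \lambda_1, \qquad \lambda_{2n} \geq (n - 1) + \lambda_2. \tag{5.14}$$

Using (5.9) we see that (5.14) holds good if

$$\nu \equiv 0, 1 \pmod 5, \qquad \nu - \mu \not\equiv 0, 1 \pmod 5, \qquad \nu \equiv \mu \pmod 6, \qquad 0 \leq \mu < 6. \tag{5.15}$$

Here we can only assert $\lambda_2 \geq \lambda_1$, for otherwise we would be in Case (i). It can be easily seen that $\lambda_1 \geq 1$ if $\nu \equiv 16, 21, 26 \pmod{30}$ and $\lambda_1 \geq 0$ if $\nu \equiv 15, 20, 25 \pmod{30}$. Hence

$$a_i(n) \equiv 0 \pmod{5^{[(n+1)/2]}}, \qquad \nu \equiv 16, 21, 26 \pmod{30}; \tag{5.16'}$$

$$a_i(n) \equiv 0 \pmod{5^{[(n-1)/2]}}, \qquad \nu \equiv 15, 20, 25 \pmod{30}. \tag{5.16''}$$

Case (iii). Since (5.7) but not (5.6) holds,

$$\lambda_{2n+1} \geq n + \lambda_1 = \Big[\frac{2n + 1}{2}\Big] + \lambda_1, \qquad \lambda_{2n} \geq n + \lambda_1 = \Big[\frac{2n}{2}\Big] + \lambda_1, \tag{5.17}$$

since $\lambda_2 \geq \lambda_1 + 1$. Also

$$\nu \not\equiv 0, 1 \pmod 5, \qquad \nu - \mu \equiv 0, 1 \pmod 5. \tag{5.18}$$

Exactly as before $\lambda_1 \geq 1$ if $\nu \equiv 2, 7 \pmod{30}$ and $\lambda_1 \geq 0$ if $\nu \equiv 3, 4, 8, 9 \pmod{30}$. Hence

$$a_i(n) \equiv 0 \pmod{5^{[(n+2)/2]}}, \qquad \nu \equiv 2, 7 \pmod{30}; \tag{5.19'}$$

$$a_i(n) \equiv 0 \pmod{5^{[n/2]}}, \qquad \nu \equiv 3, 4, 8, 9 \pmod{30}. \tag{5.19''}$$

Case (iv). This merely asserts

$$\lambda_n \geq \lambda_1. \tag{5.20}$$

The only interesting case is $\lambda_1 \geq 1$, which happens when $\nu \equiv 12, 17, 22, 27 \pmod{30}$. Therefore

$$a_i(n) \equiv 0 \pmod 5, \qquad \nu \equiv 12, 17, 22, 27 \pmod{30}. \tag{5.21}$$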


We have omitted consideration of $p = 7$ since it runs exactly parallel to the above and can be easily completed. In case $\nu = 1$ we get the Watson-Ramanujan congruence properties of the partition function. It is seen that for $p = 5$, (5.6) as well as (5.7) hold good so that

$$P(n) \equiv 0 \pmod{5^\alpha}, \qquad 24n \equiv 1 \pmod{5^\alpha}, \tag{5.22'}$$

as then we are in Case (i). But if $p = 7$, (5.6) does not hold good, as $g = 4$, $\nu = 1$ and $4 + 1 \not\equiv 1$ or $4 \pmod 7$; whereas (5.7) holds good. We are therefore in Case (iii) and hence

$$P(n) \equiv 0 \pmod{7^{[(\alpha+2)/2]}}, \qquad 24n \equiv 1 \pmod{7^\alpha}. \tag{5.22''}$$
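The two congruences for $\nu = 1$ are easy to test on a table of partition numbers. The following sketch is not from the paper; the function names are illustrative. It computes $P(n)$ by the elementary recurrence over parts and collects any $n$ up to a given limit at which $P(n) \equiv 0 \pmod{5^\alpha}$ for $24n \equiv 1 \pmod{5^\alpha}$, or $P(n) \equiv 0 \pmod{7^{[(\alpha+2)/2]}}$ for $24n \equiv 1 \pmod{7^\alpha}$, would fail.

```python
def partition_numbers(limit):
    """Return [P(0), ..., P(limit)], adding one part size at a time."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        for n in range(part, limit + 1):
            p[n] += p[n - part]
    return p


def check_congruences(limit, max_alpha):
    """List the counterexamples (prime, alpha, n) with n <= limit; empty if none."""
    p = partition_numbers(limit)
    failures = []
    for alpha in range(1, max_alpha + 1):
        # exponent alpha for the prime 5, [(alpha + 2) / 2] for the prime 7
        for prime, exponent in ((5, alpha), (7, (alpha + 2) // 2)):
            modulus = prime ** alpha
            for n in range(1, limit + 1):
                if (24 * n - 1) % modulus == 0 and p[n] % prime ** exponent != 0:
                    failures.append((prime, alpha, n))
    return failures
```

A call such as `check_congruences(700, 3)` covers, among others, $n = 99$ for the modulus $5^3$ and $n = 243$ for the modulus $7^3$, where only divisibility by $7^2$ is asserted.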













In (5.21) we did not consider $\lambda_1 = 0$. This will merely mean that for $\nu \equiv 13, 14, 18, 19, 23, 24, 28, 29 \pmod{30}$, in general,

$$a_i(n) \not\equiv 0 \pmod 5. \tag{5.23}$$

Our results (5.22') and (5.22'') for $\nu = 1$ are Hauptsätze 1, 3, 4 and results (5.45) and (5.46) of Watson.¹¹ It is not difficult to obtain analogues of Watson's Hauptsätze 2 and 5 for $P_\nu(n)$ from our foregoing results.
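The case distinction of this section for $p = 5$ can be summarized in a few lines. The sketch below is not from the paper; the function name is an assumption, and it encodes only the sufficient conditions used in the text: (5.6) holds when $\nu \equiv 0, 1 \pmod 5$, (5.7) holds when $\nu - \mu \equiv 0, 1 \pmod 5$ with $\mu$ the least residue of $\nu$ mod 6, and $\lambda_1 \geq 1$ when $\nu \equiv 1, 2 \pmod 5$. It returns the resulting lower bound for the exponent of 5 dividing all $a_i(n)$.

```python
def floor_exponent(nu, n):
    """Lower bound for lambda_n when p = 5, following Cases (i)-(iv)."""
    mu = nu % 6                          # least non-negative residue of nu mod g = 6
    cond_56 = nu % 5 in (0, 1)           # g + nu = 1 or 2 (mod 5)
    cond_57 = (nu - mu) % 5 in (0, 1)    # g + nu - mu = 1 or 2 (mod 5)
    lam1 = 1 if nu % 5 in (1, 2) else 0  # lambda_1 >= 1 by (5.9)
    if cond_56 and cond_57:
        return lam1 + n - 1              # Case (i)
    if cond_56:
        return lam1 + (n - 1) // 2       # Case (ii)
    if cond_57:
        return lam1 + n // 2             # Case (iii)
    return lam1                          # Case (iv)
```

Grouping the residues $\nu \bmod 30$ by the value returned reproduces the seven classes listed in Theorem 1, together with the eight residues for which no congruence is asserted.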


6. We may finally make a remark about the case $p = 13$. The group $\Gamma_0(13)$ has all the properties of $\Gamma_0(5)$ and $\Gamma_0(7)$, and Klein and Fricke have shown that $(\eta(13\tau)/\eta(\tau))^2$ is a Hauptmodul of $\Gamma_0(13)$. Further, Lemmas 1, 2, and 3 hold good. We have only to prove an analogue of Lemma 4 by constructing a table of $\psi_1(\tau), \ldots, \psi_{14}(\tau)$ as polynomials in $(\eta(13\tau)/\eta(\tau))^2$. Here $\psi_\nu(\tau)$ is given by

$$13\psi_\nu(\tau) = \left(\frac{\eta(13\tau)}{\eta(\tau)}\right)^{\mu}\sum_{\lambda=0}^{12}\left(\frac{\eta(\tau)}{\eta\big(\frac{\tau + 24\lambda}{13}\big)}\right)^{\nu}, \tag{6.1}$$

$0 \leq \mu < 2$, $\nu \equiv \mu \pmod 2$. Zuckerman¹² has found an expression for $\psi_1(\tau)$ as a polynomial in $(\eta(13\tau)/\eta(\tau))^2$. The general case of $\psi_\nu(\tau)$ offers no particular difficulty except that of computation.








¹¹Watson, loc. cit., pp. 98, 99 and 124.
¹²Zuckerman, H. S., Identities analogous to Ramanujan's identities involving the partition function, Duke Math. Jour., vol. 5 (1939), p. 98.


Institute for Advanced Study 
Princeton, N.J. 




A CLASS OF SELF-DUAL MAPS 


C. A. B. SMITH AND W. T. TUTTE


1. Introduction. A dissection of a rectangle $R$ into a finite number $n$ of non-overlapping squares is called a squaring of $R$ of order $n$. The $n$ squares are called the elements of the dissection. If there is more than one element and the elements are all unequal the squaring is called perfect and $R$ is a perfect rectangle. (We use $R$ to denote both a rectangle and a particular squaring of it.) If a squared (perfect) rectangle is a square we call it a squared (perfect) square.

In the course of an investigation of squared rectangles it was found that the theory reduced to that of certain "flows of electricity" in networks (linear graphs) on the sphere. An account of this work has been given elsewhere ([1]). The connection between squared rectangles and electrical networks is discussed later on in the present paper.

We have observed that the methods for the construction of a perfect square briefly described in [1] depend on the properties of networks of a particular kind. The characteristic property of a network of this kind is that the map on the sphere which it defines is combinatorially equivalent to its dual map. For this reason we have made an investigation into the properties of such "self-dual" maps. We give our results below.

Apart from the connection of self-dual maps with squared squares, some 
of them give rise to a particularly interesting class of perfect rectangles. These 
rectangles are discussed at the end of sec. 5. 

A detailed discussion of the problem of constructing a perfect square is 
given in a companion paper by one of us. 

Before going on to the study of self-dual maps we collect some results on 
electrical networks in general which will be useful later. 

Let $N$ be a connected network whose vertices are $P_1, P_2, \ldots, P_n$ $(n \geq 2)$. The 1-cells are called wires; there may be more than one wire joining two vertices, and there may be wires whose two ends coincide. With each wire is associated a non-zero real number, its conductance. In [1] all conductances are positive. In the present paper also we are only interested in positive conductances; but negative conductances are employed in the companion paper by Tutte. We define a matrix $\{c_{rs}\}$ as follows.

If $r \neq s$, $-c_{rs}$ is the sum of the conductances of all wires joining $P_r$ to $P_s$, or 0 if there are no such wires.

$c_{rr}$ is the sum of the conductances of all wires joining $P_r$ to other vertices.





Received March 18, 1948. 













Thus

$$c_{rs} = c_{sr}; \qquad \sum_r c_{rs} = 0. \tag{A}$$

From (A) we can readily show that all the first cofactors of $\{c_{rs}\}$ are equal. We call their common value the complexity of the network, and denote it by $C$. It is known that $C > 0$ when all the conductances are positive. (There is a proof of this result in [1].)
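The equality of the first cofactors is easy to observe on small networks. The sketch below is not from the paper; the function names and the representation of a network as a list of triples (end, end, conductance) are assumptions made for illustration. It builds $\{c_{rs}\}$ and evaluates every first cofactor exactly with rational arithmetic.

```python
from fractions import Fraction


def kirchhoff_matrix(n, wires):
    """Matrix {c_rs} for vertices 0..n-1; wires is a list of (r, s, conductance)."""
    c = [[Fraction(0)] * n for _ in range(n)]
    for r, s, g in wires:
        if r != s:                 # a wire whose two ends coincide contributes nothing
            c[r][r] += g
            c[s][s] += g
            c[r][s] -= g
            c[s][r] -= g
    return c


def determinant(m):
    """Determinant by Gaussian elimination over exact fractions."""
    m = [row[:] for row in m]
    n, det = len(m), Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det             # a row swap changes the sign
        det *= m[col][col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for k in range(col, n):
                m[r][k] -= factor * m[col][k]
    return det


def first_cofactors(c):
    """All n*n first cofactors of the matrix c, row by row."""
    n = len(c)
    cofactors = []
    for r in range(n):
        for s in range(n):
            minor = [[c[i][j] for j in range(n) if j != s] for i in range(n) if i != r]
            cofactors.append((-1) ** (r + s) * determinant(minor))
    return cofactors
```

For the complete network on four vertices with unit conductances every cofactor equals 16, and for a triangle with conductances 1, 2, 3 every cofactor equals $1\cdot2 + 2\cdot3 + 1\cdot3 = 11$.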

The second cofactor obtained by taking the cofactor of the element $c_{su}$ in the cofactor of $c_{rt}$ in $\{c_{rs}\}$ is denoted by $(rs.tu)$. (If $n = 2$, $(12.12) = 1 = -(21.12)$.) We put $(rr.tu) = 0 = (rs.tt)$. We call the $(rs.tu)$'s the transpedances of $N$.

We also write the transpedance $(rs.tu)$ as $(P_rP_s.P_tP_u)$.

Consider a flow of current from $P_x$ to $P_y$ (the poles). The currents in the wires then satisfy (except at the poles) Kirchhoff's Laws, which we state as follows.

(i) The total current flowing into $P_i$ is zero.

(ii) The algebraic sum of the EMF's round any circuit is zero.

The EMF in a wire in the direction of the current may be defined as the current in the wire divided by the conductance of the wire. The EMF in the opposite direction is the negative of this. If (ii) is satisfied for all circuits we may associate a potential $v_i$ with each node $P_i$ so that the EMF in a wire with ends $P_i$ and $P_j$ in the direction $P_i$ to $P_j$ is $v_i - v_j$.

It is known that these conditions determine the flow uniquely when the total current $I$ (flowing in at $P_x$ and out at $P_y$) is given and the conductances are all positive ([2], 324-331).

Then the fall in potential from $P_r$ to $P_s$ is given by

$$\frac{(xy.rs)I}{C}. \tag{B}$$

It is convenient to take $I = C$, thus fixing the values of the currents and potential differences of the network. The flow with $I = C$ is called the full flow; we speak of "full currents", etc.

From the definition of a transpedance it follows that 


(C) (rs.tu) = (tu.rs) = — (sr.tu). 

Using (B) we may restate Kirchhoff's Laws for the full flow as

$$\sum_x c_{tx}(rs.tx) = C(\delta_{ts} - \delta_{tr}), \tag{D}$$

$$(rs.tu) + (rs.uv) = (rs.tv). \tag{E}$$

The function $\delta_{rs}$ is equal to 1 if $r = s$, and to 0 otherwise.

Another general property of transpedances is the following:

$$C \text{ divides } (rs.rs)(tu.tu) - (rs.tu)^2 \tag{F}$$

(for integral conductances).

To prove this we use Jacobi's Theorem on determinants ([3], p. 98). This states that if $A$ is a determinant, $A_{ij}$ the cofactor of the element $a_{ij}$ of $A$, and $A_{pq,rs}$ the determinant obtained from $A$ by striking out the $p$th and $q$th rows and the $r$th and $s$th columns, then

$$A\,A_{pq,rs} = A_{pr}A_{qs} - A_{ps}A_{qr}.$$
If we apply this result to the determinant $X$ which is the minor of the element $c_{ij}$ in the matrix $\{c_{rs}\}$, and if we assume that all the conductances are integers, we find that

$$C \text{ divides } (ip.jr)(iq.js) - (ip.js)(iq.jr). \tag{G}$$

This proof assumes that $p \neq q$, $r \neq s$ and that $X$ has at least three rows. But (G) is trivially true when one of these conditions is not satisfied. It is also trivially true when $i = p$ or $q$, or $j = r$ or $s$. It is thus a general property of transpedances. If we replace $p$ by $p'$ in (G) and then subtract (G) from the resulting formula we obtain the result

$$C \text{ divides } (pp'.jr)(iq.js) - (pp'.js)(iq.jr) \tag{H}$$

by (E). Next we replace $q$ by $q'$ in (H) and then subtract (H) from the resulting formula. After four operations of this kind we have the result

$$C \text{ divides } (pp'.rr')(qq'.ss') - (pp'.ss')(qq'.rr'), \tag{I}$$

where $P_p$, $P_{p'}$, etc. are any vertices of $N$. (F) is a special case of (I).


2. Self-dual maps. We define a map as a dissection of the surface of a sphere into a finite number of simple polygons $P_1, P_2, \ldots, P_m$, called faces. The boundary of each face is a simple closed curve, subdivided by a finite number $\geq 2$ of points called vertices into simple arcs called edges. It is supposed that

(i) No two faces have any interior point in common. 

(ii) Each edge is common to just two faces. 

(iii) Each vertex is a vertex of every face in whose boundary it lies. 

(iv) The union of the faces, edges, and vertices is the whole sphere. 

We shall speak of vertices, edges, and faces collectively as cells. 

A cell will be said to be incident with any cell which is contained in its boun- 
dary or in whose boundary it is contained. (The boundary of an edge is its 
pair of end-points.) 

The vertices and edges of a map constitute a network or linear graph which 
we call the 1-section of the map. 

Two maps $M_1$ and $M_2$ are combinatorially equivalent if there is a 1-1 correspondence $f$ between the set of cells $X$ of $M_1$ and the set of cells $f(X)$ of $M_2$ such that $f(X)$ is a vertex, edge, or face according as $X$ is a vertex, edge or face, and such that $f(X)$ is incident with $f(Y)$ if and only if $X$ is incident with $Y$. We call the correspondence $f$ a combinatorial equivalence.

We shall be interested in a regular subdivision $Z(M)$ of the map $M$. We define this as follows. In each edge $W_i$ of $M$ we select just one interior point of this edge, and denote it by $w_i$. In each face $P_j$ we select just one interior point which we denote by $P^*_j$. We subdivide $P_j$ into triangles by joining $P^*_j$ to each vertex and each point $w_i$ in the boundary of $P_j$. The resulting map is $Z(M)$. Its vertices are the vertices of $M$ together with all the points $w_i$ and $P^*_j$, its edges are the simple arcs into which the points $w_i$ divide the edges of $M$ and the joins made from the points $P^*_j$, and its faces are the triangles into which the faces $P_j$ are subdivided. Since its faces are triangles it is called a triangulation or simplicial dissection of the sphere.

As consequences of the method of construction we see that

(i) Each face of $Z(M)$ is incident with one vertex of $M$, one of the points $w_i$ and one of the points $P^*_j$;

(ii) Each vertex $w_i$ of $Z(M)$ is incident with just four edges.

Let us enumerate the vertices of $M$ as $V_1, V_2, \ldots, V_n$. Consider the union of all the faces of $Z(M)$, with their boundaries, which are incident with $V_k$. The boundary of this set is a simple closed curve. We denote the set by $V^*_k$. As a consequence of this result we can define a new map $M^*$ as follows:

(i) The vertices of $M^*$ are the points $P^*_j$.

(ii) For each point $w_i$ we denote by $W^*_i$ the union of the two edges of $Z(M)$ joining $w_i$ to points $P^*_j$. The edges of $M^*$ are the arcs $W^*_i$.

(iii) The faces of $M^*$ are the sets $V^*_k$.

It is easily verified that M* satisfies the definition of a map. It is a dual 
map of M. 

We observe that the combinatorial structure of M* (given by the incidence 
relations) is fixed by that of M, that Z(M) is a regular subdivision of M*, and 
that M is a dual map of M*. 

A map M is self-dual if there is a combinatorial equivalence f transforming 
M into M*. 

An edge $W_j$ of a self-dual map M is self-dual under the combinatorial 
equivalence f transforming M into M* if $f(W_j) = W^*_j$. 

For any combinatorial equivalence f transforming a map $M_1$ into a map $M_2$, 
let Y be a point set which is the union of a set of cells $X_1, X_2, \ldots, X_k$ of $M_1$. 
Then we define f(Y) as the union of the set of cells $f(X_1), f(X_2), \ldots, f(X_k)$ of 
$M_2$. 


THEOREM I. Let f be a combinatorial equivalence transforming a map M 
into its dual M*. Then there is a combinatorial equivalence $f_z$ transforming Z(M) 
into itself such that if X is any cell of M, $f_z(X) = f(X)$. 

For vertices of Z(M) we define $f_z$ as follows: 

(i) If $V_i$ is a vertex of M, $f_z(V_i) = f(V_i)$. 

(ii) If $W_j$ is an edge of M, and $f(W_j) = W^*_k$, then $f_z(w_j) = w_k$. 

(iii) If $P_k$ is any face of M, and if $V_i$ is the vertex of M which is contained 
in $f(P_k)$, then $f_z(P^*_k) = V_i$. 

Consider a face F of Z(M) with incident vertices $V_i$, $w_j$, $P^*_k$. $f(P_k)$ is a 
face of M* whose boundary contains $f(W_j)$ and its end-point $f(V_i)$. Hence 
just one of the triangles into which $f(P_k)$ is subdivided has the vertices $f_z(V_i)$, 
$f_z(w_j)$ and $f_z(P^*_k)$. So just one of the faces of Z(M) has these three vertices. 
We take this face as $f_z(F)$. If G is any edge of F, with end-points A and B, we 
define $f_z(G)$ as the edge of $f_z(F)$ with end-points $f_z(A)$ and $f_z(B)$. This defines 
$f_z(G)$ uniquely, for it is clear that not more than one edge can join two given 
vertices of Z(M). It follows that $f_z$ is a 1-1 correspondence, and that it 
preserves incidence relations. 


THEOREM II. Let $f_z$ be defined as above. Then there is a positive integer n, 
and a homeomorphism H of the sphere onto itself such that $H^n = I$ (the identical 
mapping) and such that $H(X) = f_z(X)$ where X is any cell of Z(M). 


We begin by making a further subdivision of Z(M). For each edge $E_s$ 
having one end a vertex of M and the other a vertex of M* we select just one 
interior point $e_s$. We then subdivide each face of Z(M) into two triangles 
by making a join from its vertex $w_j$ to the opposite point $e_s$. 

Each face of the new map Z' is a triangle incident with one vertex either of 
M or M*, one vertex $w_j$ and one vertex $e_s$. (We take the vertices of Z' to 
be the vertices of Z(M) together with the points $e_s$.) 

Clearly there is a combinatorial equivalence f' transforming Z' into itself 
such that if X is any cell of Z(M), $f'(X) = f_z(X)$. (We take $f'(e_s)$ to be $e_t$, 
where $f_z(E_s) = E_t$.) 

The correspondence f' has the property that if any iteration $(f')^m$ of f' 
transforms the cell X of Z' into itself, then $(f')^m$ transforms every cell of Z' in the 
boundary of X into itself. For no iteration of f' can map a vertex $V_i$ or $P^*_k$ 
onto a vertex $w_j$, or either onto an $e_s$. 

Since f' is a 1-1 correspondence, if X is any cell of Z' some iteration $(f')^m$ 
of f' will map X into itself. (The number of cells of Z' is finite.) The least 
positive integer m for which this is so will be denoted by r(X). 

In constructing the homeomorphism H we use the topological theorem 
that any homeomorphism of the boundary F(X) of an n-simplex X into the 
boundary F(Y) of an n-simplex Y can be extended as a homeomorphism of 
X onto Y. We begin by defining H for the vertices v of Z' by $H(v) = f'(v)$. 
If X is any edge of Z' this gives us a homeomorphism H of F(X) onto F(f'(X)) 
which we can extend as a homeomorphism of X onto f'(X). Actually if 
r(X) = 1 we define the extension of H to X as the identity mapping. This 
is possible since H does not interchange the points of F(X). If r(X) > 1 we 
define the extension of H to X, f'(X), ..., $(f')^{r(X)-2}(X)$ as we please, and then 
define it for $(f')^{r(X)-1}(X)$ by postulating that $H^{r(X)}$ reduces to the identity 
mapping in X. The extension of H to the faces of Z' is analogous. From 
this construction it follows that H is a homeomorphism of the sphere onto 
itself which satisfies $H^n = I$, where n is the L.C.M. of the numbers r(X). 
Moreover, if X is any cell of Z(M), $H(X) = f'(X) = f_z(X)$. 

Now it is known that any periodic homeomorphism of the sphere onto itself is 
topologically equivalent to a rotation, a reflection, or a rotation followed by 
a reflection ([4]). Any self-dual map M is topologically equivalent therefore 
to one which is transformed into its dual by one of these three operations. 

3. Dual flows. Consider an edge $W_j$ of any map M, not necessarily self-dual, 
with ends $V_a$, $V_b$ and incident faces $P_r$, $P_s$. We orient $W_j$ by specifying 
one end, $V_a$ say, as its positive end and the other as its negative end. 

Since each face $P_k$ has a boundary which is a simple closed curve we may 
orient $P_k$ by specifying a particular sense of description of this curve 
(clockwise or anti-clockwise). For this it is enough to give the cyclic order of the 
vertices of Z(M) in the curve. 

We say that $W_j$ is positively or negatively incident with an incident face $P_k$ 
according as its positive end immediately precedes or immediately succeeds $w_j$ 
in the chosen cyclic order of vertices of Z(M) in the boundary of $P_k$. 

From now on the symbols $W_j$, $P_k$ will denote edges or faces taken with 
some fixed orientation. For the same edges or faces taken with opposite 
orientation we use the symbols $-W_j$ and $-P_k$. 

We shall in fact take for the orientation of each face $P_k$ or $V^*_i$ that cyclic 
order of vertices which agrees with some fixed positive sense of rotation about 
simple polygons on the sphere. We can think of this sense as the clockwise 
direction as seen from the centre of the sphere. It is evident that $W_j$ is 
positively incident with one of its incident faces of M and negatively incident with 
the other. 


FIGURE 1 


Suppose the edge $W_j$ considered in the first paragraph of this section is 
positively incident with $P_r$ and negatively incident with $P_s$. Then we define 
the orientation of $W^*_j$ dual to that of $W_j$ by taking $P^*_r$ as its positive end 
and $P^*_s$ as its negative end. Fig. 1 shows the state of affairs in the region 
defined by the four faces of Z(M) which meet at $w_j$. The orientations of 
$W_j$ and $W^*_j$ are indicated by arrows directed from positive to negative ends. 
The curved arrows show the positive sense of rotation. From this figure we 
observe that 

(i) If $W_j$ is positively (negatively) oriented with respect to $P_k$, then $P^*_k$ 
is the positive (negative) end of $W^*_j$; 

(ii) If $V_i$ is the positive (negative) end of $W_j$, then $W^*_j$ is negatively 
(positively) oriented with respect to $V^*_i$. 

We shall now consider the 1-sections of M and M* as electrical networks 
in which every "wire" (edge) has conductance 1. Consider any distribution 
(flow) F of currents in M. We denote the current in $W_j$ from positive to 
negative end by $I_j$, so the current in the opposite direction is $-I_j$. If $P_k$ is 
any face of M we denote by $E(P_k)$ the sum of the currents in the edges of $P_k$, 
an edge $W_j$ contributing $I_j$ or $-I_j$ to this sum according as it is positively or 
negatively oriented with respect to $P_k$. 

We define the dual flow F* in M* by taking the current in $W^*_j$ from positive 
to negative end to be $I_j$. Then from (i) and (ii) we find 

(iii) $E(P_k)$ = (algebraic sum of the currents from $P^*_k$ in the incident edges 
of M*) and 

(iv) $E(V^*_i)$ = -(algebraic sum of the currents from $V_i$ in the incident 
edges of M). 

Now the full flow in the 1-section of M whose positive and negative poles 
are the positive and negative ends respectively of $W_j$ will be called the full 
flow in M with polar edge $W_j$. The current $I_k$ in this flow is denoted by the 
"transpedance" $(W_j.W_k)$. Then using (C) we have 

(1)   $(W_j.W_k) = (W_k.W_j) = -(W_j.(-W_k))$. 
THEOREM III. For any edges $W_j$ and $W_k$ of a map M, 

(2)   $(W^*_j.W^*_k) = -\frac{C^*}{C}(W_j.W_k)$ if $j \neq k$, 
      $(W^*_j.W^*_j) = C^* - \frac{C^*}{C}(W_j.W_j)$, 

where C is the complexity¹ of M and C* is the complexity of M*. 


Consider the full flow in M with polar edge $W_j$. The total current flowing 
from the positive end of $W_j$ is C, by (D). Let F be the flow obtained from 
this by replacing each $I_k$ for which $k \neq j$ by $\frac{C^*}{C} I_k$, and replacing $I_j$ by 
$-\frac{C^*}{C}(C - I_j)$. 
Then F satisfies Kirchhoff's Laws at each vertex and in the boundary of 
each face, save only for the two faces incident with $W_j$. Consequently F* 
satisfies the Laws in the boundary of each face and at each vertex not incident 
with $W^*_j$, by (iii) and (iv). Now if $P_r$ is the face of M with which $W_j$ is 
positively incident we evidently have 

$E(P_r) = \bigl(-(W_j.W_j) + (-(C - (W_j.W_j)))\bigr)\frac{C^*}{C} = -C^*$ 

for the flow F. It readily follows, using (iii) and (iv), that F* is the full flow 
in M* with polar edge $-W^*_j$. The Theorem follows. 


¹More precisely we should say the complexity of the 1-section of M. It is clear that this 
1-section is connected. Hence we can suppose C > 0. (See sec. 1). 

COROLLARY. If M and M* are combinatorially equivalent we may replace 
the above results by 

(3)   $(W^*_j.W^*_k) = -(W_j.W_k)$ if $j \neq k$, 
      $(W^*_j.W^*_j) = C - (W_j.W_j)$. 

For then C = C*. 

As a matter of fact, C* is equal to C for all maps M, so that equations (3) 
are of general application. There is a proof in [1] that C* = C for all maps. 
In this paper we shall be concerned mainly with the case in which M and M* 
are combinatorially equivalent, and we shall not need a proof of the general 
theorem C* = C. 


4. Reflexes. Let $W_j$ be any edge of a map M, and $V_a$ the positive end of 
$W_j$. Then (by (i) and (ii) of sec. 3) we find first that $W^*_j$ is negatively oriented 
with respect to $V^*_a$ and thence that $((V_a)^*)^* = V_a$ is the negative end of 
$(W^*_j)^*$. Now $(W^*_j)^*$ is an oriented edge of M which contains $w_j$, and so we 
have 

$(W^*_j)^* = -W_j$. 

We return to the case of a self-dual map M transformed into its dual map 
by an operation $\phi$ which is either a rotation, a reflection, or a rotation followed 
by a reflection. 

Now it is easily verified that if we define a positive sense of rotation on a 
sphere, then the effect of a rotation is to map any positively oriented simple 
polygon onto another positively oriented simple polygon, but the effect of a 
reflection is to map positively oriented simple polygons onto negatively 
oriented ones. As a consequence of this and the definition of duality, we have 

(i) If $\phi$ is a reflection, or a rotation followed by a reflection, then for each 
edge $W_j$ of M 
$\phi(W^*_j) = -(\phi(W_j))^*$, 
and 
(ii) If $\phi$ is a rotation, then for each edge $W_j$ of M 
$\phi(W^*_j) = (\phi(W_j))^*$. 

Now if $W_j$ is any edge of M, $\phi^{-1}(W^*_j)$ is also an edge of M. We denote it 
by $\bar{W}_j$. We say that M is a reflex with respect to $\phi$ if it has more than two 
edges and satisfies 

$\bar{\bar{W}}_j = W_j$. 

Now in order that M shall be a reflex with respect to $\phi$ it is evidently 
necessary that $\phi^2(w_j)$ shall be equal to $w_j$ for each $w_j$. Whether we are dealing 
with case (i) or with case (ii), $\phi^2$ must be a rotation (it must map positively 
oriented simple polygons onto positively oriented ones). As M has at least 
three points $w_j$, $\phi^2$ is a rotation which leaves three distinct points $w_j$ invariant. 
Hence $\phi^2 = I$, the identity mapping. 

We define $\epsilon$ to have the value -1 in case (i) and +1 in case (ii). Then 
since $\phi^2 = I$ we have in either case 
$\bar{\bar{W}}_j = \phi^{-1}((\bar{W}_j)^*) = \phi^{-1}((\phi^{-1}(W^*_j))^*) = \phi((\phi^{-1}(W^*_j))^*)$ 
$= \epsilon.(\phi(\phi^{-1}(W^*_j)))^* = \epsilon.(W^*_j)^*$ 
$= -\epsilon.W_j$. 

Thus M is a reflex with respect to $\phi$ in case (i) but not in case (ii). 

We deduce that there are essentially only two different kinds of reflexes, 
those in which $\phi$ is a reflection in a plane through the centre of the sphere, 
which we call planar reflexes, and those in which $\phi$ is a reflection followed by 
a rotation (through an angle not $2\pi$) which we call central reflexes. (It is easily 
verified from the relation $\phi^2 = I$ that for a central reflex $\phi$ is a rotation through 
an angle $\pi$ about an axis through the centre of the sphere followed by a 
reflection in the plane through the centre perpendicular to that axis. $\phi$ is 
thus a "reflection in the centre," transforming each point of the sphere into 
its diametrically opposite point.) 


THEOREM IV. Let $W_j$ be any edge of a reflex. Then if $W_j$ is not self-dual, 
$(W_j.\bar{W}_j) = 0$. 


For then 
$(W_j.\bar{W}_j) = -(W^*_j.(\bar{W}_j)^*)$, Theorem III, corollary; 
$= -(\phi W^*_j.\phi(\bar{W}_j)^*)$, symmetry of Z(M); 
$= -(\bar{W}_j.\bar{\bar{W}}_j) = -(\bar{W}_j.W_j)$, definition of a reflex; 
$= -(W_j.\bar{W}_j)$ by (1). 


Consider any reflex M. Let m be the number of its vertices. Then m is 
also the number of its faces by the symmetry of Z(M). By the Euler polyhedron 
formula it follows that the number of its edges is 2m - 2. Let the 
vertices, edges and faces be enumerated as $V_1, V_2, \ldots, V_m$, as $W_1, W_2, \ldots, 
W_{2m-2}$, and as $P_1, P_2, \ldots, P_m$ respectively. 

The structure of the 1-section of M can be represented by its incidence 
matrix $H_1 = \{\eta^1_{ij}\}$. Here $\eta^1_{ij}$ is +1, -1 or 0 according as $V_i$ is the positive 
end of, the negative end of, or not incident with, $W_j$. 

Let K be the matrix $\{c_{hk}\}$ defined in sec. 1, for M. 

Then 
(4)   $K = H_1H_1'$, 
where $H_1'$ denotes the transpose of $H_1$. For the (h,h)th element of $H_1H_1'$ is 
the sum of the squares of the elements of the hth row of $H_1$, that is the number 
of edges incident with $V_h$. And the (h,k)th element ($h \neq k$) is evidently 
-J where J is the number of edges which join $V_h$ and $V_k$. 
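
As a small numerical illustration of (4), the following sketch builds $H_1$ for the 1-section of a tetrahedron, forms $H_1H_1'$, and evaluates one first cofactor. The tetrahedron, the chosen edge orientations and all function names are assumptions made for this illustration; they do not come from the text.

```python
from fractions import Fraction

def incidence_matrix(n_vertices, edges):
    """Rows are vertices, columns are edges: +1 at the positive end, -1 at the negative end."""
    h1 = [[0] * len(edges) for _ in range(n_vertices)]
    for j, (positive_end, negative_end) in enumerate(edges):
        h1[positive_end][j] = 1
        h1[negative_end][j] = -1
    return h1

def times_transpose(h):
    """Return the product of h with its own transpose."""
    return [[sum(x * y for x, y in zip(row_a, row_b)) for row_b in h] for row_a in h]

def determinant(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    a = [[Fraction(x) for x in row] for row in matrix]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return result

def complexity(k):
    """A first cofactor of K: strike out the first row and the first column."""
    return determinant([row[1:] for row in k[1:]])

# Hypothetical example: the six edges of a tetrahedron, each written (positive end, negative end).
TETRAHEDRON = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
K = times_transpose(incidence_matrix(4, TETRAHEDRON))
C = complexity(K)
print(K, C)
```

Reversing the orientation of any edge changes the sign of a column of $H_1$ but leaves the product unchanged, so the choice of orientations does not affect K.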

The incidence matrix $H_2 = \{\eta^2_{jk}\}$ is defined as follows: $\eta^2_{jk}$ is +1, -1 or 0 
according as $W_j$ is positively incident, negatively incident, or not incident 
with $P_k$. By elementary combinatorial topology we have ([5], p. 68) 

(5)   $H_1H_2 = 0$. 
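
Relation (5) can be checked by hand on a small case. The sketch below does so for a tetrahedron whose four faces are given as cyclic vertex orders; the example, the face orientations and the function names are assumptions introduced here, with "positively incident" read as in sec. 3 (the positive end immediately precedes the edge in the cyclic order of the face).

```python
# Hypothetical example data: tetrahedron edges as (positive end, negative end)
# and faces as cyclic orders of their vertices, all described in the same sense.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
FACES = [(0, 1, 2), (0, 3, 1), (0, 2, 3), (1, 3, 2)]

def vertex_edge_incidence(n_vertices, edges):
    """H1: +1 where the vertex is the positive end of the edge, -1 where it is the negative end."""
    h1 = [[0] * len(edges) for _ in range(n_vertices)]
    for j, (positive_end, negative_end) in enumerate(edges):
        h1[positive_end][j] = 1
        h1[negative_end][j] = -1
    return h1

def edge_face_incidence(edges, faces):
    """H2: +1 where the face boundary runs from the positive to the negative end of the edge."""
    h2 = [[0] * len(faces) for _ in edges]
    for k, cycle in enumerate(faces):
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            if (a, b) in edges:
                h2[edges.index((a, b))][k] = 1
            else:
                h2[edges.index((b, a))][k] = -1
    return h2

def matmul(x, y):
    """Plain integer matrix product."""
    return [[sum(x[i][t] * y[t][j] for t in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

H1 = vertex_edge_incidence(4, EDGES)
H2 = edge_face_incidence(EDGES, FACES)
PRODUCT = matmul(H1, H2)
print(PRODUCT)
```

Each row of `H2` contains exactly one +1 and one -1, which is the statement that an edge is positively incident with one of its faces and negatively incident with the other.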











Now we divide the edges of M into four disjoint classes: $S_1$, $S_2$, $S_3$, $S_4$. $S_1$ is 
the class of all self-dual edges² $W_j$ which satisfy $\bar{W}_j = W_j$, $S_2$ is the class of all 
self-dual edges which satisfy $\bar{W}_j = -W_j$, and finally the non-self-dual edges 
are partitioned among $S_3$ and $S_4$ in such a way that $\bar{W}_j$ belongs to $S_4$ when $W_j$ 
belongs to $S_3$. Let p denote the number of members of $S_1$, q the number of 
members of $S_2$, r the number of members of $S_3$ and therefore also of $S_4$. 
With a suitable ordering of the edges of M we can partition the matrix $H_1$ 
as follows 
(6)   $H_1 = \{L_1 | L_2 | L_3 | L_4\}$. 

Here $L_i$ is the submatrix of $H_1$ defined by the columns corresponding to 
members of $S_i$. We imply by (6) that the edges of M are so ordered that the 
columns of $L_1$ come first in $H_1$, then those of $L_2$, and so on. 

We now obtain a similar expression for $H_2$. We retain the same order for 
the edges, and we take the ith row of $H_2'$ to correspond to the face $\phi V^*_i$. The 
edge $W_j$ is positively incident, negatively incident, or not incident with $\phi V^*_i$ 
according as $(\bar{W}_j)^*$ is negatively incident, positively incident, or not incident 
with $V^*_i$ (since $\phi$ reverses the orientation of a face), that is according 
as $V_i$ is the positive end, the negative end, or not an end of $\bar{W}_j$ (by sec. 3, 
prop. (ii)). Hence 
(7)   $H_2 = \{L_1 | -L_2 | L_4 | L_3\}'$. 

By (4), (5), (6) and (7) we have 

$K = L_1L_1' + L_2L_2' + L_3L_3' + L_4L_4'$, 
$0 = L_1L_1' - L_2L_2' + L_3L_4' + L_4L_3'$, 
whence 
(8)   $K = 2L_1L_1' + [L_3 + L_4][L_3 + L_4]'$ 
(9)   $\phantom{K} = 2L_2L_2' + [L_3 - L_4][L_3 - L_4]'$. 
We can write (8) and (9) in the forms 
(10)  $K = (\sqrt{2}L_1 | [L_3 + L_4])(\sqrt{2}L_1 | [L_3 + L_4])'$, 
(11)  $K = (\sqrt{2}L_2 | [L_3 - L_4])(\sqrt{2}L_2 | [L_3 - L_4])'$. 

Each of these expresses K as a product of a matrix with its transpose. 

Now the rank of K cannot exceed that of $(\sqrt{2}L_1 | [L_3 + L_4])$, or that of 
$(\sqrt{2}L_2 | [L_3 - L_4])$. But the first of these matrices has p + r and the second 
q + r columns. Further the rank of K is m - 1, since C > 0 and |K| = 0. 
(By the definition of K its columns sum to zero.) As the sum (p + r) + (q + r) 
is equal to the number of edges of M, which is 2m - 2, it follows that 
(12)  p = q. 

The number of edges $W_j$ of M such that $\bar{W}_j = \pm W_j$ is thus even. We 
denote it by 2n. 

We see also that the matrices $L_1$ and $L_2$ have each m rows and n columns, 
while $L_3$ and $L_4$ have each m rows and (m - n - 1) columns. Therefore 


²Here by "self-dual" we mean "self-dual with respect to the operation $\phi$". 

$U = (\sqrt{2}L_1 | [L_3 + L_4])$ and $V = (\sqrt{2}L_2 | [L_3 - L_4])$ have each m rows and 
(m - 1) columns. Since the sum of the elements in any column of $H_1$ is zero, 
the same is true of U and V. These matrices accordingly have the property 
that all their minor determinants obtained by striking out one row are equal 
apart from sign; let us say that the minor determinants are equal to $\pm u$ 
for U and $\pm v$ for V. Since K = UU' = VV' we have, taking any first 
cofactor in K (which will by definition be the complexity of the 1-section of M), 

$C = u^2 = v^2$. 
But from its definition $u = (\sqrt{2})^n X$, where X is some integer. Hence we 
have 
THEOREM V. The complexity of M is $2^nX^2$, where X is an integer. 


Thus the complexity C of a reflex M is either of the form $Y^2$ or else of the 
form $2Y^2$, where Y is an integer. 

THEOREM VI. If C is of the form $Y^2$ the transpedances of the 1-section of 
M all divide by Y; if C is of the form $2Y^2$, where Y is even, they all divide by 2Y. 


Let Z denote Y in the first case, and 2Y in the second case. 

First, if $\bar{W}_j = \pm W_j$, we have $(W_j.W_j) = (\bar{W}_j.\bar{W}_j) = (W^*_j.W^*_j)$. Hence by 
(3), $C = 2(W_j.W_j)$ and so Z divides $(W_j.W_j)$. 

For any other edge $W_j$ we have by (F), 

C divides $((W_j.W_j)(\bar{W}_j.\bar{W}_j) - (W_j.\bar{W}_j)^2)$. 
But $(W_j.\bar{W}_j) = 0$, by Theorem IV, and $(\bar{W}_j.\bar{W}_j) = (W^*_j.W^*_j) = C - (W_j.W_j)$, 
by (3). Hence C divides $(W_j.W_j)^2$ and therefore Z divides $(W_j.W_j)$. 

Next, if $W_j$ and $W_k$ are distinct edges of M then by (F) 

C divides $((W_j.W_j)(W_k.W_k) - (W_j.W_k)^2)$. 
But by our previous result Z divides $(W_j.W_j)$ and $(W_k.W_k)$. Hence C divides 
$(W_j.W_j)(W_k.W_k)$ and therefore C divides $(W_j.W_k)^2$. Consequently Z divides 
$(W_j.W_k)$. 

This proves the theorem for transpedances (ab.cd) in which $V_a$ and $V_b$ 
are joined by an edge and $V_c$ and $V_d$ are joined by an edge. We can complete 
the proof by showing that each transpedance is a sum of transpedances of 
this form. This readily follows from (C) and (D). 
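
Theorems IV and VI can be tried out on the smallest example that comes to mind. The sketch below assumes, as an illustration not taken from the text, that M is the tetrahedron inscribed in the sphere, which reflection in the centre carries into its dual, so that $\bar{W}_j$ is the edge opposite to $W_j$. It computes the currents of the full flow with a given polar edge by solving Kirchhoff's equations exactly; the function names are hypothetical.

```python
from fractions import Fraction

def laplacian(n, edges):
    """Kirchhoff matrix K of a network whose edges all have conductance 1."""
    k = [[0] * n for _ in range(n)]
    for a, b in edges:
        k[a][a] += 1
        k[b][b] += 1
        k[a][b] -= 1
        k[b][a] -= 1
    return k

def eliminate(a, b):
    """Return (det a, x) with a x = b, by exact Gauss-Jordan elimination (a nonsingular)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    det = Fraction(1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        pivot_value = m[col][col]
        det *= pivot_value
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return det, [row[n] for row in m]

def transpedances(n, edges, polar):
    """Complexity C and the currents (W_polar . W_k), k = 0, 1, ..., in the full flow."""
    k = laplacian(n, edges)
    positive, negative = edges[polar]
    keep = [v for v in range(n) if v != negative]       # the negative pole has potential 0
    reduced = [[k[r][c] for c in keep] for r in keep]
    c, unit = eliminate(reduced, [1 if v == positive else 0 for v in keep])
    potential = {negative: Fraction(0)}
    potential.update({v: c * x for v, x in zip(keep, unit)})  # total current C enters at the positive pole
    return c, [potential[a] - potential[b] for a, b in edges]

# Hypothetical example: tetrahedron edges as (positive end, negative end); edge 5 is opposite edge 0.
EDGES = [(0, 1), (0, 2), (0, 3), (1, 2), (1, 3), (2, 3)]
C, FLOW = transpedances(4, EDGES, 0)
print(C, FLOW)
```

Here C = 16 is of the form $Y^2$ with Y = 4, every current is a multiple of 4, and the current in the opposite edge vanishes.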


5. Squared rectangles. Let M be any map. We orient the edges and faces 
of M as in sec. 3. Let $W_j$ be any edge of M. Let the positive and negative 
ends of $W_j$ in M be $V_a$ and $V_b$ respectively. Let the faces of M incident with 
$W_j$ be $P_r$ and $P_s$. We may suppose that $W_j$ is positively incident with $P_r$ and 
negatively incident with $P_s$. 

It is clear that the 1-sections of M and M* are connected. Hence the 
complexities of these maps are positive. We shall denote the complexities of these 
maps by C and C* respectively. 

Let F be the full flow in M with polar edge $W_j$ and let $F_1$ be the full flow in 
M* with polar edge $W^*_j$. 

We may suppose that $V_b$ has zero potential in F and that $P^*_s$ has zero 
potential in $F_1$. Then the potential of $V_a$ in F is $(W_j.W_j)$ and the potential 
of $P^*_r$ in $F_1$ is $(W^*_j.W^*_j)$. 

A vertex $V_i$ of M is said to be active in F if there is a non-zero current (in F) 
in some edge incident with $V_i$ in M. Since C > 0 it follows from (D) that $V_a$ 
and $V_b$ are active in F. If $V_i$ is not a pole of F and is active in F it is evident 
from Kirchhoff's Laws that $V_i$ is incident with an edge in which a positive 
current flows to $V_i$ and an edge in which a positive current flows from $V_i$. 
Consequently $V_i$ is then joined by edges of M to one vertex of M of higher 
potential and one vertex of M of lower potential than $V_i$. 

It follows that, in the flow F, the active vertices of highest and lowest 
potential are the poles. Since C > 0 it follows from (D) that $V_a$ is incident 
with an edge in which a positive current flows from $V_a$ in F. The other end of this 
edge is either $V_b$ or an active vertex of lower potential than $V_a$. From these 
observations we may deduce the physically obvious result that 
(13)  $(W_j.W_j) \geq v \geq 0$, 
where v is the potential of any active vertex of M in F. 

Similarly we have 
(14)  $(W^*_j.W^*_j) \geq w \geq 0$, 
where w is the potential of any active vertex of M* in $F_1$. 

Let $\xi$ be any real number. We say that an edge $W_k$ of M comprises $\xi$ if 
$\xi$ lies between the potentials in F of the ends of $W_k$. If $W_k$ comprises $\xi$ the 
ends of $W_k$ are active in F. So by (13) we have 

(15)  $(W_j.W_j) > \xi > 0$. 

Similarly $W^*_k$ comprises the real number $\eta$ if $\eta$ lies between the potentials in 
$F_1$ of the ends of $W^*_k$, and if $W^*_k$ comprises $\eta$ we have 

(16)  $(W^*_j.W^*_j) > \eta > 0$. 


Suppose that $W_k$ is not $W_j$ and that the current of F in $W_k$ is non-zero. 
Then the set of all points 
$\left(\frac{C}{C^*}\eta, \xi\right)$ 
in the (x, y) plane such that $W_k$ comprises $\xi$ and $W^*_k$ comprises $\eta$ is the 
interior of a square $E_k$ of side $|(W_j.W_k)|$, by (2). By (2), (15) and (16), $E_k$ is 
contained in the rectangle 
$(W_j.W_j) \geq y \geq 0$, $C - (W_j.W_j) \geq x \geq 0$. 
We call this rectangle R. 

Let $\xi$ be any real number satisfying (15) and not equal to the potential in 
F of any vertex of M. Let S be the set of all vertices of M whose potential in 
F exceeds $\xi$, and let T be the set of all other vertices of M. Let X be the set 
of all edges of M which have one end in S and the other in T. Thus X is the 
set of all edges of M which comprise $\xi$. X is non-null, for $W_j \in X$, by (15). 

Let $P_k$ be any face of M. Each vertex incident with $P_k$ is either in S or in 
T. From this it follows that the number $n_k$ of members of X incident with $P_k$ 
is even. Also in consecutive members of X in the boundary of $P_k$ the positive 
currents flow in opposite directions in this boundary. The current in a 
member of X is non-zero by the definition of X. 

Let X* be the set of edges of M* dual to the members of X. We say that 
a vertex $P^*_k$ of M* is $\xi$-active in M* if it is incident with a member of X*. 
By the preceding paragraph it follows that the number of edges of X* incident 
with a $\xi$-active vertex of M* is even. Since $W_j \in X$, $P^*_r$ and $P^*_s$ are $\xi$-active 
in M*. If $P^*_k$ is any other $\xi$-active vertex of M* it follows from the preceding 
paragraph, and from equations (2), that in the flow $F_1$ the positive current in 
half the members of X* incident with $P^*_k$ flows to $P^*_k$, and the positive current 
in the other half flows from $P^*_k$. So then $P^*_k$ is joined by edges of M* to one 
$\xi$-active vertex of higher potential in $F_1$ and to one $\xi$-active vertex of lower 
potential in $F_1$. 

We can construct a simple arc L in the 1-section of M*, with ends $P^*_r$ and 
$P^*_s$, having the following properties: 

(i) Each edge of L is in X*, and L does not contain $W^*_j$; 

(ii) The potentials in $F_1$ of the vertices of M* in L, taken in order from 
$P^*_r$ to $P^*_s$ in L, form a strictly decreasing sequence. 

To construct L we first observe that $P^*_r$, being incident with an even 
number of members of X*, is incident with one edge $K_1$ of X* other than $W^*_j$. 
Let $U_1$ be the other end of $K_1$. By the definition of X the current of F in $K^*_1$ 
is non-zero, and therefore the current of $F_1$ in $K_1$ is non-zero. So by (14) the 
potential in $F_1$ of $U_1$ is less than that of $P^*_r$. If $U_1$ is not $P^*_s$ it is joined by a 
member of X*, $K_2$ say, to a $\xi$-active vertex $U_2$ of M* of lower potential in $F_1$. 
Similarly if $U_2$ is not $P^*_s$ it is joined by a member $K_3$ of X* to a $\xi$-active 
vertex of M* of lower potential in $F_1$, and so on. The sequence $K_1, K_2, \ldots$ 
must terminate since the number of edges of M* is finite. Clearly the union 
of the edges $K_1, K_2, \ldots$ is a simple arc L in the 1-section of M*, with ends 
$P^*_r$ and $P^*_s$, having properties (i) and (ii). 

Let L' be the simple closed curve in the 1-section of M* obtained by 
adjoining $W^*_j$ to L. Then L' is a union of members of X*. 

Any vertex of M must be contained in one of the two residual domains in 
the sphere of the simple closed curve L'. The two ends of a member J of X 
lie in different faces of M* incident with J*. Hence if J* is contained in L' 
they lie in different residual domains of L'. In particular $V_a$ and $V_b$ lie in 
different residual domains of L'. We denote the residual domains containing 
$V_a$ and $V_b$ by $D_+$ and $D_-$ respectively. 

The potential in F of any active vertex of M which is in $D_+$ must exceed $\xi$. 
For let $V_i$ be a vertex of M which is in $D_+$, is active in F, and has the lowest 
possible potential in F consistent with these conditions. Since $V_i$ is in $D_+$ it 
is not $V_b$. Hence it is joined by an edge H of M to an active vertex of lower 
potential. This vertex must be in $D_-$. Hence H intersects L'. Thus H must 
be a member of X. Consequently H comprises $\xi$ and therefore the potential 
in F of $V_i$ exceeds $\xi$. 

A similar argument shows that the potential in F of any active vertex of M 
which is in $D_-$ must be less than $\xi$. 

We conclude that if J is any member of X its two ends must be in different 
residual domains of L'. Hence J intersects L' and therefore J* is one of the 
edges of M* in L'. So L' is the union of all the members of X*. 

Now let $\eta$ be any real number satisfying (16), and not equal to the 
potential in $F_1$ of any vertex of M*. By properties (i) and (ii) of the arc L it follows 
that there is just one edge $W_k$ of M other than $W_j$ such that $W_k$ comprises $\xi$ 
and $W^*_k$ comprises $\eta$. 

From this result it is easily seen that no two of the squares $E_k$ have any 
interior point in common, and that each point of the rectangle R belongs to 
at least one of the squares $E_k$. We recall that $E_k$ is defined only when $W_k$ is 
not $W_j$ and the current $I_k$ of F is non-zero. 

Thus the squares $E_k$ define a squaring of the rectangle R. 
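
The passage from a network to a squared rectangle can be followed numerically. The sketch below is an illustration under stated assumptions: the network is a hypothetical c-net chosen here (it is the one belonging to the well-known 33 by 32 squared rectangle of order 9, which does not appear in the text), only the flow F in M is computed, and the function names are invented for the example. The sides of the squares are the currents in the edges other than the polar edge, the vertical side is the current in the polar edge, and the horizontal side is C minus that current.

```python
from fractions import Fraction
from functools import reduce
from math import gcd

# Hypothetical c-net: vertex 0 is the top side of the rectangle, vertex 5 the bottom side,
# vertices 1 to 4 are the interior horizontal segments. Edge 0 is the polar edge.
EDGES = [(0, 5), (0, 2), (0, 1), (1, 3), (1, 4), (2, 5), (2, 3), (3, 5), (3, 4), (4, 5)]

def laplacian(n, edges):
    """Kirchhoff matrix of a network whose edges all have conductance 1."""
    k = [[0] * n for _ in range(n)]
    for a, b in edges:
        k[a][a] += 1
        k[b][b] += 1
        k[a][b] -= 1
        k[b][a] -= 1
    return k

def eliminate(a, b):
    """Return (det a, x) with a x = b, by exact Gauss-Jordan elimination (a nonsingular)."""
    n = len(a)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    det = Fraction(1)
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det
        pivot_value = m[col][col]
        det *= pivot_value
        m[col] = [v / pivot_value for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return det, [row[n] for row in m]

def transpedances(n, edges, polar):
    """Complexity C and the currents of the full flow with the given polar edge."""
    k = laplacian(n, edges)
    positive, negative = edges[polar]
    keep = [v for v in range(n) if v != negative]
    reduced = [[k[r][c] for c in keep] for r in keep]
    c, unit = eliminate(reduced, [1 if v == positive else 0 for v in keep])
    potential = {negative: Fraction(0)}
    potential.update({v: c * x for v, x in zip(keep, unit)})
    return c, [potential[a] - potential[b] for a, b in edges]

C, FLOW = transpedances(6, EDGES, 0)
SIDES = [abs(int(i)) for k, i in enumerate(FLOW) if k != 0 and i != 0]
REDUCTION = reduce(gcd, SIDES)
REDUCED_SIDES = sorted(s // REDUCTION for s in SIDES)
REDUCED_VERTICAL = int(FLOW[0]) // REDUCTION
REDUCED_HORIZONTAL = int(C - FLOW[0]) // REDUCTION
print(REDUCED_SIDES, REDUCED_VERTICAL, REDUCED_HORIZONTAL)
```

The horizontal positions of the squares would come from the potentials of the dual flow $F_1$ in M*, which this sketch does not construct; the area check in the usage below is independent of them.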

We say that the 1-section of M is a c-net of the resulting squared rectangle. 
The network obtained from this 1-section by suppressing the edge $W_j$ is a 
p-net of the squared rectangle. The highest common factor of the lengths of the 
sides of the squares $E_k$, i.e., the highest common factor of the transpedances 
$(W_j.W_k)$ taken for the given value of j and all values of k, is called the reduction 
of the squared rectangle. 

Segments parallel to the x axis will be called horizontal. Segments parallel 
to the y axis will be called vertical. The lengths of the vertical and horizontal 
sides of the squared rectangle R are $(W_j.W_j)$ and $C - (W_j.W_j)$ respectively. 
The numbers obtained by dividing these by the reduction of the squared 
rectangle are called the reduced vertical and horizontal sides respectively. The 
numbers $(W_j.W_j)$ and $C - (W_j.W_j)$ are called the full vertical and horizontal 
sides respectively. 

A point in R which is common to four of the elements $E_k$ is called a cross of 
the squared rectangle. 

As a consequence of Theorem VI (leaving aside the case $C = 2Y^2$, Y odd) 
we see that the reduction of any squared rectangle having a reflex as c-net is 
a multiple of the reduced horizontal side. This property also holds for the 
reduction of a squared square ([1]). It seems plausible that if one made a 
list of a few hundreds of such rectangles one would discover some perfect 
squares among them. At least the possibility of deriving a perfect square 
from a given reflex cannot be excluded, as it can for most networks, by 
the reduction theorems of [1]. We have not made such a long list; we merely 
draw the attention of more industrious squarers of rectangles to the possibility. 

We have evaluated a few squared rectangles of fairly small order having 
1-sections of reflexes as c-nets. The perfect ones in our list all correspond to 
central reflexes. They are given below in the notation of C. J. Bouwkamp 
([6], pp. 1179-1180). 

In Bouwkamp’s notation the top left-hand corner of each component square 
of a squared rectangle is taken as its “representative point.” The lengths of 
the sides of those squares for which the representative points lie in the same 
horizontal segment (connected component of the union of horizontal sides of 
the elements of the squared rectangle) are bracketed together in the order of 
the representative points from left to right. The brackets read in order from 
top to bottom of the rectangle. When several brackets correspond to col- 
linear horizontal segments they are written in the order of these segments 
from left to right. 
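
The reading rule just described can be turned into a short checking procedure. The sketch below is an assumption-laden illustration rather than Bouwkamp's own method: it keeps, for each unit column, the depth already filled, places every bracket at the leftmost point of the highest unfilled horizontal segment, and fails if a square does not sit on a flat segment. The function names are hypothetical; the data are the brackets of Rectangle (1) in the list of this section.

```python
from collections import Counter

def place_squares(width, brackets):
    """Return (height, squares); each square is (x, y, side) with y measured down from the top."""
    filled = [0] * width                 # depth already covered in each unit column
    squares = []
    for bracket in brackets:
        top = min(filled)
        x = filled.index(top)            # leftmost point of the highest unfilled segment
        for side in bracket:
            if x + side > width or any(depth != top for depth in filled[x:x + side]):
                raise ValueError(f"square {side} does not fit at x={x}, y={top}")
            squares.append((x, top, side))
            filled[x:x + side] = [top + side] * side
            x += side
    if len(set(filled)) != 1:
        raise ValueError("the squares do not fill a rectangle")
    return filled[0], squares

def crosses(squares):
    """Points that are a corner of four of the squares."""
    corners = Counter()
    for x, y, s in squares:
        corners.update([(x, y), (x + s, y), (x, y + s), (x + s, y + s)])
    return sorted(point for point, count in corners.items() if count == 4)

# Brackets of Rectangle (1), reduced sides 271 (horizontal) and 257 (vertical).
RECTANGLE_1 = [[91, 80, 100], [11, 49, 20], [67, 35], [29, 30, 61], [32, 3],
               [52, 28, 1], [31], [24, 4], [99], [96], [76]]
WIDTH = sum(RECTANGLE_1[0])
HEIGHT, SQUARES = place_squares(WIDTH, RECTANGLE_1)
print(WIDTH, HEIGHT, len(SQUARES), crosses(SQUARES))
```

For this code the procedure reports a 271 by 257 rectangle built from 22 squares of distinct sides, with a cross at the point 151 units from the left side and 129 units below the top side.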

We remark that each one of the rectangles listed below has a cross. It can 
be shown that this is a consequence of Theorem IV. 

Rectangle (1). Order XXII. Full horizontal side $(271)^2$. Reduction 271. 
Reduced sides 271, 257. 
(91, 80, 100), (11, 49, 20), (67, 35), (29, 30, 61), (32, 3), 
(52, 28, 1), (31), (24, 4), (99), (96), (76). 

Rectangle (2). Order XXIV. Full horizontal side $(480)^2$. Reduction 480. 
Reduced sides 480, 456. Side-ratio 20:19. 
(158, 160, 162), (118, 40), (38, 91, 31), (29, 133), (60), (78), 
(25, 66, 34, 26), (180, 41), (8, 18), (32, 10), (161), (139). 

Rectangle (3). Order XXIV. Full horizontal side $(494)^2$. Reduction 494. 
Reduced sides 494, 418. Side-ratio 13:11. 
(183, 149, 162), (34, 102, 13), (59, 116), (113, 104), (30, 29), 
(1, 28), (36, 66, 31), (4, 140), (35), (9, 131), (122), (101). 

Rectangle (4). Order XXIV. Full horizontal side $(459)^2$. Reduction 459. 
Reduced sides 459, 401. 
(118, 107, 123, 111), (11, 80, 16), (12, 99), (129), (64, 87), 
(35, 45, 41, 23), (18, 191), (25, 10), (59), (55), (154), (114). 

Rectangle (5). Order XXIV. Full horizontal side $(463)^2$. Reduction 463. 
Reduced sides 463, 464. 
(200, 134, 129), (45, 84), (94, 40), (54, 31), (109, 63, 28), 
(23, 8), (92), (35, 87, 77), (46, 52), (10, 159), (155), (149). 

Rectangle (6). Order XXIV. Full horizontal side $(473)^2$. Reduction 473. 
Reduced sides 473, 435. 
(166, 138, 169), (57, 81), (137, 29), (50, 119), (86), (62, 69), 
(27, 59, 55, 7), (48, 147), (132, 5), (32), (4, 99), (95). 

Rectangle (7). Order XXIV. Full horizontal side $(399)^2$. Reduction 399. 
Reduced sides 399, 429. Side-ratio 133:143. 
(137, 120, 142), (17, 81, 22), (154), (59, 41, 64), (18, 23), 
(106, 34, 13, 5), (8, 84), (21), (55), (139), (138, 16), (122). 

Rectangle (8). Order XXIV. Full horizontal side $(424)^2$. Reduction 848. 
Reduced sides 212, 214. Side-ratio 106:107. 
(79, 62, 71), (17, 36, 9), (27, 20, 33), (53, 24, 19), 
(7, 13), (5, 50, 34), (29), (46), (82), (16, 18), (66), (64). 

The last of these deserves special comment. It is remarkable that any 
perfect rectangle of the twenty-fourth order should have such small elements. 
Even in the thirteenth order most of the perfect rectangles have larger 
reduced elements than this. (A list of all the simple squared rectangles of order 
less than 14 is given in [6].) 


6. Central and planar reflexes. For a central reflex the number n is zero, 
since $\phi$ can transform no point $w_j$ into itself. Hence, by Theorem V, the 
complexity of a central reflex is of the form $X^2$ where X is an integer. This 
is exemplified by the full sides given in the above list. 

For any planar reflex M, let Q be the great circle in which the plane of 
symmetry of Z(M) cuts the sphere. By symmetry considerations Q can meet 
the boundary of a face F of Z(M) only in the vertex $w_j$ or in the mid-point 
of the opposite side ($V_iP^*_k$, say). It follows that Q cuts the 1-section of M 
only in points $w_j$. An arc in Q joining two consecutive points $w_j$ on Q, let us 
say $w_{j_1}$ and $w_{j_2}$, is evidently a diagonal of a quadrilateral $w_{j_1}w_{j_2}V_iP^*_k$ 
composed of two faces of Z(M). 

From this we deduce that if $w_1, w_2, \ldots, w_{2n}$ are the points $w_j$ on Q, taken in 
their cyclic order on Q, then $W_1, W_2, \ldots, W_{2n}$ is a cyclic sequence of 
self-dual edges in which each edge has one end in common with its successor, and the 
other with its predecessor. We say that these edges constitute the girdle 
of M. The girdle of M* evidently consists of the edges $W^*_1, W^*_2, \ldots, W^*_{2n}$. 

From these considerations it is evident that an edge $W_j$ of M satisfies 
$\bar{W}_j = \pm W_j$ if and only if it is in the girdle of M. 

THEOREM VII. The edges $W_j$ of the girdle satisfy alternately $\bar{W}_j = W_j$ and 
$\bar{W}_j = -W_j$. 


Consider the quadrilateral $w_{j_1}w_{j_2}V_iP^*_k$ mentioned above. If $V_i$ is the 
positive (negative) end of both $W_{j_1}$ and $W_{j_2}$ then $W^*_{j_1}$ and $W^*_{j_2}$ are both 
negatively (positively) oriented with respect to $V^*_i$. (Sec. 3, Prop. (ii)). Then 
$P^*_k$ must be the positive end of one of them and the negative end of the other, 
by (5). It follows that $V_i$ is the positive end of one of the edges $\bar{W}_{j_1}$, $\bar{W}_{j_2}$ and the 
negative end of the other, whence the theorem is true for $W_{j_1}$ and $W_{j_2}$. The 
argument when $V_i$ is the positive end of just one of the edges $W_{j_1}$ and $W_{j_2}$ is 
analogous. 

Thus the edges of the girdle belong alternately to $S_1$ and $S_2$. (See sec. 4). 


THEOREM VIII. If W_i and W_k are distinct edges belonging to the same class
S_1 or S_2, then (W_i.W_k) = 0.

For then
       (W_i.W_k) = −(W*_i.W*_k), by (3),
                 = −(φW*_i.φW*_k)
                 = −(W_i.W_k), by the definition of S_1 and S_2.













THEOREM IX. Let W; be an edge of the girdle of a planar reflex M. Then 
the squared rectangle corresponding to the full flow in the 1-section of M with 
polar edge W; is a diagonally symmetric squared square. 


We use the notation of sec. 5. 


Suppose there is an edge W_j of M, other than W_i, and real numbers ξ and η,
such that W_j comprises −ξ and W*_j comprises η. Then by the symmetry of
Z(M), some edge W_k comprises η and (W_k)* comprises −ξ. Hence the elements of
the rectangle corresponding to W_j and W_k are reflections of one another in
the line y = x. The theorem follows.

It is this theorem that underlies the methods for the construction of perfect 
squares given in [1]. It is easily seen that we can construct a planar reflex by 
the following sequence of operations. First we draw the girdle. Then we 
arbitrarily fix the 1-section of M in the “northern hemisphere,” arranging that 
it shall fit the given girdle and not meet the “equator’’ Q. Then we subdivide 
this so as to form the part of Z(M) in the northern hemisphere. Finally we 
complete Z(M) by reflecting in the equator. 

Fig. 2 shows a planar reflex as seen from above the “north pole”. The full
lines represent the part of M in the northern hemisphere. The broken lines
represent equally well the part of M in the southern hemisphere or the part of
M* in the northern hemisphere.





FIGURE 2


The device adopted in [1] amounts to taking a “rotor” for the part of the
1-section of M, apart from the edges of the girdle, in the northern hemisphere
and then in the resulting planar reflex replacing this rotor, in one hemisphere
only, by its mirror image. It is found that this destroys the symmetry of the
squares of Theorem IX without affecting their squareness. For details the
reader is referred to [1] and to the companion paper which follows immediately.













REFERENCES 


[1] R. L. Brooks, C. A. B. Smith, A. H. Stone and W. T. Tutte, “The Dissection of
Rectangles into squares,” Duke Math. J., vol. 7 (1940), 312-340.

[2] J. H. Jeans, The Mathematical Theory of Electricity and Magnetism (Cambridge, 1908).

[3] A. C. Aitken, Determinants and Matrices (Edinburgh, 1939).

[4] B. v. Kerékjártó, “Über die periodischen Transformationen der Kreisscheibe und der
Kugelfläche,” Math. Ann., vol. 80 (1919), 36-38.

[5] O. Veblen, Analysis Situs (Amer. Math. Soc. Colloquium Publications, 2nd ed. (1913)).

[6] C. J. Bouwkamp, On the Dissection of Rectangles into Squares, Papers I and II.
(Koninklijke Nederlandsche Akademie van Wetenschappen, Proc., vol. 49 (1946), 1176-1188,
and vol. 50 (1947), 58-71).


University College, London 
and 
University of Toronto 











SQUARING THE SQUARE 
W. T. TUTTE 


1. Introduction. It is the object of this paper to describe in more detail 
than has hitherto been done the general methods by which a square may be 
dissected into smaller unequal non-overlapping squares. Examples of such 
dissections are given. 

The problem of finding a method for constructing a “simple” perfect rect- 
angle whose sides are in any given rational ratio remains unsolved. It is found 
however that the theory developed to deal with the case in which the ratio is 1 
can also be applied to construct a family of simple perfect dissections for the 
particular case in which the ratio is 15/17. (A squared rectangle is “simple”
if it contains no smaller squared rectangle of order >1.)


2. Self-dual maps. It is remarked in the companion paper, [1], that most 
of the general methods for the construction of the perfect squares depend on 
the properties of self-dual maps. (For an exception, see [2], [3] and [4].) In 
the notation of that paper, if the edge W_i of M is equivalent under the
symmetry of Z(M) to W*_i, then in the full flow with polar edge W_i we have

(1)    (W_i.W_i) = C − (W*_i.W*_i)
                 = C − (W_i.W_i)

by [1], equation (3).

Hence the squared rectangle whose c-net is the 1-section of M and the poles
of whose p-net are the ends of W_i has its horizontal side equal to its vertical
side; it is a squared square ([1], Sec. (5)).

It is not of course perfect, because of the symmetry of Z(M). We shall see 
however that in some cases it can be modified so as to give a perfect square. 
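Relation (1) says that a self-dual polar edge has transpedance equal to half the complexity. The following sketch checks this numerically on the tetrahedron, which is taken here only as an assumed small example of a self-dual map. The function names are hypothetical, and the determinant formulation (complexity as a first cofactor of the Kirchhoff matrix, transpedance as C times a potential difference obtained by Cramer's rule) is assumed to agree with the definitions of the companion paper.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def reduced_kirchhoff(n, edges):
    """Kirchhoff matrix with the row and column of the last vertex removed."""
    k = [[0] * n for _ in range(n)]
    for a, b, c in edges:              # (end, end, conductance)
        k[a][a] += c
        k[b][b] += c
        k[a][b] -= c
        k[b][a] -= c
    return [row[:n - 1] for row in k[:n - 1]]

def complexity(n, edges):
    return det(reduced_kirchhoff(n, edges))

def transpedance(n, edges, r, s, t, u):
    """(rs.tu): potential drop from t to u when current C enters at r and leaves at s."""
    a = reduced_kirchhoff(n, edges)
    g = n - 1                          # grounded vertex, potential 0
    rhs = [0] * g
    if r != g:
        rhs[r] += 1
    if s != g:
        rhs[s] -= 1

    def scaled_potential(v):           # C times the potential of v (Cramer's rule)
        if v == g:
            return Fraction(0)
        av = [row[:] for row in a]
        for i in range(g):
            av[i][v] = rhs[i]
        return det(av)

    return scaled_potential(t) - scaled_potential(u)

# Tetrahedron: all six edges of conductance 1.
K4 = [(a, b, 1) for a in range(4) for b in range(a + 1, 4)]
C = complexity(4, K4)
V = transpedance(4, K4, 0, 1, 0, 1)
print(C, V)   # prints: 16 8
```

Here 2V = C, so the rectangle with this polar edge is a square; it is the square cut into four equal quarters, which is as far from perfect as the symmetry suggests.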

It was shown in [1] that such “self-dual edges” occur in all planar reflexes
but not (in general) in central reflexes. Besides the reflexes there is one other
kind of self-dual map in which a self-dual edge can occur. This is the case in
which the map M is transformed into its dual map by a rotation through an
angle π/2 about an axis X through the centre of the sphere. Evidently the
points at which X cuts the sphere cannot be vertices of M or M*. Neither
can they be interior points of edges of Z(M). Each must therefore be one of
the points w_i. If they are taken to be w_1 and w_2, then W_1 and W_2 are evidently
self-dual edges of M. We shall call the axis X a dualizing tetrad axis.

It was shown in [1] that by taking an edge of the girdle of a planar reflex as
polar edge we obtain a flow representing a diagonally symmetric squared
square. In a similar way it can be shown that if the polar edge of a flow is
transformed into its dual edge by a rotation through π/2 about a dualizing
tetrad axis then the corresponding squared square has the symmetry of the
swastika. It is thus far from perfect.

Received March 18, 1948.


3. Externally equivalent networks. We are left with the problem of turn- 
ing a symmetric squared square into an unsymmetric one. We shall discuss 
one solution of the problem in the present section. 

In this paper, if P_i and P_j are two vertices of an electrical network, we
shall denote the transpedance (P_iP_j.P_iP_j), or (ij.ij), by V(P_iP_j) or V_ij. We
write V_ii = 0.

We shall prove two general propositions about transpedances. They are 
as follows. 

(2)    (i)  2(rs.rt) = V_rs + V_rt − V_st;
       (ii) 2(rs.tu) = V_ru + V_st − V_rt − V_su.
We prove (i) as follows.
       (rs.rt) = (rs.rs) + (rs.st) = (rt.rt) + (ts.rt),
by formulae (C) and (E) of the companion paper. Hence, by these same
formulae,
       2(rs.rt) = (rs.rs) + (rt.rt) − ((sr.st) + (rt.st))
                = V_rs + V_rt − V_st.
We may now prove (ii) as follows.
       2(rs.tu) = 2(rs.ru) − 2(rs.rt)
                = (V_rs + V_ru − V_su) − (V_rs + V_rt − V_st)
                = V_ru + V_st − V_rt − V_su.
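
The two propositions (2) can be checked numerically. The sketch below does so on a small network chosen arbitrarily for illustration; the network, the function names, and the determinant formulation of complexity and transpedance are assumptions made here, not taken from the companion paper.

```python
from fractions import Fraction
from itertools import permutations

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def reduced_kirchhoff(n, edges):
    """Kirchhoff matrix with the row and column of the last vertex removed."""
    k = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        k[a][a] += c
        k[b][b] += c
        k[a][b] -= c
        k[b][a] -= c
    return [row[:n - 1] for row in k[:n - 1]]

def transpedance(n, edges, r, s, t, u):
    """(rs.tu): potential drop from t to u when current C enters at r and leaves at s."""
    a = reduced_kirchhoff(n, edges)
    g = n - 1
    rhs = [0] * g
    if r != g:
        rhs[r] += 1
    if s != g:
        rhs[s] -= 1

    def scaled_potential(v):           # C times the potential of v (Cramer's rule)
        if v == g:
            return Fraction(0)
        av = [row[:] for row in a]
        for i in range(g):
            av[i][v] = rhs[i]
        return det(av)

    return scaled_potential(t) - scaled_potential(u)

# A small connected test network (hypothetical), all conductances 1.
N_VERTICES = 5
EDGES = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 0, 1), (0, 2, 1), (1, 3, 1)]

def V(a, b):
    return transpedance(N_VERTICES, EDGES, a, b, a, b)

def identities_hold():
    """Check (2)(i) and (2)(ii) for every choice of four distinct vertices."""
    for r, s, t, u in permutations(range(N_VERTICES), 4):
        if 2 * transpedance(N_VERTICES, EDGES, r, s, r, t) != V(r, s) + V(r, t) - V(s, t):
            return False
        if 2 * transpedance(N_VERTICES, EDGES, r, s, t, u) != V(r, u) + V(s, t) - V(r, t) - V(s, u):
            return False
    return True

print(identities_hold())
```

Exact rational arithmetic is used so that the comparison is an equality test rather than a tolerance test.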


Suppose that N and N' are two connected electrical networks such that
C(N) = C(N'). Suppose further that there is a set A = {A_1, A_2, ..., A_n} of
vertices of N and a set {A'_1, A'_2, ..., A'_n} of vertices of N' such that

(3)    V(A_iA_j) = V(A'_iA'_j)

for any members A_i and A_j of A.

Then by (2) we have

(4)    (A_iA_j.A_kA_l) = (A'_iA'_j.A'_kA'_l),

where A_i, A_j, A_k and A_l are arbitrary members of A.

By a flow F in N with polar set A we mean a flow of current in N which
satisfies Kirchhoff’s Laws everywhere except at some members of A. In such
a flow F let the sum of the currents flowing from A_i in the edges of N incident
with A_i be denoted by I_i. The full flow in N with poles A_j and A_n will be
denoted by F_j (j = 1, 2, ..., n − 1). Because of the linearity of the Kirchhoff
equations we may write

       F = \sum_{j=1}^{n-1} (I_j / C(N)) F_j.

We mean by this that any current or potential difference x in F is equal to
the sum

       \sum_{j=1}^{n-1} (I_j / C(N)) x_j,

where x_j is the corresponding current or potential difference in F_j.

We denote by F'_j the full flow in N' with poles A'_j and A'_n (j = 1, 2, ...,
n − 1). We write

       F' = \sum_{j=1}^{n-1} (I_j / C(N')) F'_j.

Then, since C(N') = C(N), the sum of the currents of F' flowing from A'_i in
the edges of N' incident with A'_i is I_i. Also by (4) the potential drop from
A'_i to A'_j in F' is equal to the potential drop from A_i to A_j in F.

We call F' the flow in N' corresponding to F.


Now let us suppose that N is part of a connected network L and that each 
vertex of N which is incident with an edge of L not belonging to N belongs to 
A. We define an external transpedance of L as a transpedance (rs.tu) with
the four corresponding vertices P_r, P_s, P_t, P_u so chosen that those in N belong
to A.

We consider the effect of replacing N by N’ in L. More precisely we sup- 
pose that N’ has no point in common with L, we identify A; with A’; for each 
i, and then we suppress all the edges and vertices of N not belonging to A. 


We denote the resulting graph by L’. Transpedances of L’ will be distinguished 
by primes. 


THEOREM. The complexity and external transpedances of L are invariant
under the operation of replacing N by N'.


Consider a full flow Φ in L with positive pole P_r and negative pole P_s, such
that if either of these vertices is in N it belongs to A. The currents of Φ in
N define a flow F in N with polar set A. Let F' be the corresponding flow in
N'. Let us replace N by N' and F by F'. Then by the definition and properties
of F' there results a flow Φ' in L' which satisfies Kirchhoff’s Laws everywhere
except at P_r and P_s. The potential difference between two vertices
P_i and P_j common to L and L' is unaltered by this process.¹ Also the total
current flowing from P_r is the same in L' as in L. We conclude that Φ' is the
full flow in L', with positive pole P_r and negative pole P_s, multiplied by
C(L)/C(L'). Hence

(5)    (rs.tu)/C(L) = (rs.tu)'/C(L')

for each external transpedance (rs.tu) of L.

¹If P_i denotes A_j in L it is taken to denote A'_j in L'. In this case we still say that P_i is
common to L and L'.













Let q denote the number of edges of L not in N. If q = 0, then C(L) = C(L')
since L = N and L' = N'. We assume as an inductive hypothesis that C(L)
= C(L') whenever q is less than some positive integer m. We go on to consider
the case q = m.

In this case let E be an edge of L not in N. Let its ends be P_x and P_y. Let
L_1 and L'_1 be the networks obtained from L and L' respectively by omitting
the edge E. Let L_2 and L'_2 be the networks obtained from L_1 and L'_1
respectively by identifying P_x and P_y.

Suppose P_x and P_y are not both in N. Clearly L_2 and L'_2 are connected.
Also L'_2 is obtained from L_2 by the operation of replacing N by N'. Hence
C(L'_2) = C(L_2) by the inductive hypothesis. Thus (xy.xy)' = C(L'_2) = C(L_2) =
(xy.xy) by the definition of complexity and transpedances in terms of
determinants given in the companion paper. Hence C(L') = C(L), by (5).

Next suppose that P_x and P_y are both in N. Then L_1 and L'_1 are connected
and L'_1 is obtained from L_1 by replacing N by N'. Hence, by the inductive
hypothesis, the transpedance (P_xP_y.P_xP_y) has the same value in L'_1 as in L_1.
But by the definitions this transpedance has the same value in L_1 as in L, and
the same value in L'_1 as in L'. Hence (xy.xy)' = (xy.xy) and C(L') = C(L) as
before.

It follows by induction that C(L') = C(L) for all values of q. The theorem
now follows from (5).
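
A very small instance of the Theorem, with a polar set of two vertices, may make the statement concrete. In the sketch below N is a doubled edge in series with a single edge, and N' is the same pair in the opposite order; both have complexity 2 and V(A_1A_2) = 3. The surrounding network, the vertex numbering and the function names are all assumptions made for illustration.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def reduced_kirchhoff(n, edges):
    """Kirchhoff matrix with the row and column of the last vertex removed."""
    k = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        k[a][a] += c
        k[b][b] += c
        k[a][b] -= c
        k[b][a] -= c
    return [row[:n - 1] for row in k[:n - 1]]

def complexity(n, edges):
    return det(reduced_kirchhoff(n, edges))

def V(n, edges, r, s):
    """V_rs = (rs.rs), computed as C times the potential drop from r to s."""
    a = reduced_kirchhoff(n, edges)
    g = n - 1
    rhs = [0] * g
    if r != g:
        rhs[r] += 1
    if s != g:
        rhs[s] -= 1

    def scaled_potential(v):
        if v == g:
            return Fraction(0)
        av = [row[:] for row in a]
        for i in range(g):
            av[i][v] = rhs[i]
        return det(av)

    return scaled_potential(r) - scaled_potential(s)

# Vertices: 0 = A_1, 1 = inner vertex, 2 = A_2, 3 = B (outside N).
N_PART = [(0, 1, 1), (0, 1, 1), (1, 2, 1)]     # doubled edge, then single edge
N_PRIME = [(0, 1, 1), (1, 2, 1), (1, 2, 1)]    # single edge, then doubled edge
REST = [(0, 3, 1), (2, 3, 1), (2, 3, 1)]       # edges of L not in N

L = N_PART + REST
L_PRIME = N_PRIME + REST

print(complexity(4, L), complexity(4, L_PRIME))   # prints: 12 12
print(V(4, L, 0, 3), V(4, L_PRIME, 0, 3))         # prints: 8 8
```

The two networks L and L' are not isomorphic (the doubled edges alternate round the circuit in one and are adjacent in the other), yet the complexity and the external transpedance V(A_1B) agree.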

As examples of externally equivalent networks we have various pairs of
equivalent p-nets ([5], Sec. 7). We return to these later. These are cases
in which n = 2, n being the number of members of A.

Another case is obtained by taking N and N' to be a rotor and its mirror
image. By saying that N is a rotor we mean that it has n-fold rotational
symmetry in which A is a set of corresponding points. In this case the
external equivalence follows immediately from the symmetry. Rotors are
discussed in Sec. (7) of [5].

Before explaining a third case we note another general property of
transpedances. Let N be an electrical network. Let E be an edge of N, of
conductance 1, with ends P_x and P_y. Let N' be the network derived from N by
suppressing E. We suppose that N' is connected, so that C(N') > 0. We
distinguish transpedances of N' by primes. Then for any transpedance
(rs.tu)' of N' we have

(6)    (rs.tu)' = (rs.tu) − ((rs.tu)(xy.xy) − (rs.xy)(tu.xy))/C,

where C is the complexity of N.

To prove this, let F_rs be the full flow in N with positive pole P_r and negative
pole P_s. We define F_xy analogously. We consider the flow

       F = (1 − (xy.xy)/C) F_rs + ((rs.xy)/C) F_xy.

It is easily verified that by suppressing E we obtain from F a flow F' in N'
which satisfies Kirchhoff’s Laws everywhere except at P_r and P_s. Hence
F' = λF'_rs, where F'_rs is the full flow in N' with positive pole P_r and negative
pole P_s. By considering the total current flowing from P_r in F we find that
λC(N') = C − (xy.xy). But it follows from the definition of complexity and
transpedances as determinants that C − (xy.xy) = C(N'). Hence λ = 1 and
therefore F' = F'_rs. Formula (6) follows.
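
Formula (6), together with the relation C(N') = C − (xy.xy) used in its proof, can be tested on a small network. The network, the suppressed edge and the function names in this sketch are assumptions chosen for illustration; the determinant formulation is assumed to match that of the companion paper.

```python
from fractions import Fraction
from itertools import product

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def reduced_kirchhoff(n, edges):
    """Kirchhoff matrix with the row and column of the last vertex removed."""
    k = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        k[a][a] += c
        k[b][b] += c
        k[a][b] -= c
        k[b][a] -= c
    return [row[:n - 1] for row in k[:n - 1]]

def complexity(n, edges):
    return det(reduced_kirchhoff(n, edges))

def transpedance(n, edges, r, s, t, u):
    """(rs.tu): potential drop from t to u when current C enters at r and leaves at s."""
    a = reduced_kirchhoff(n, edges)
    g = n - 1
    rhs = [0] * g
    if r != g:
        rhs[r] += 1
    if s != g:
        rhs[s] -= 1

    def scaled_potential(v):
        if v == g:
            return Fraction(0)
        av = [row[:] for row in a]
        for i in range(g):
            av[i][v] = rhs[i]
        return det(av)

    return scaled_potential(t) - scaled_potential(u)

# Hypothetical network N; the edge E = (X, Y) is suppressed to give N'.
EDGES = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 0, 1), (0, 2, 1), (1, 3, 1)]
X, Y = 0, 2
REDUCED = [e for e in EDGES if (e[0], e[1]) != (X, Y)]

def formula_6_holds():
    c = complexity(5, EDGES)
    v_xy = transpedance(5, EDGES, X, Y, X, Y)
    if complexity(5, REDUCED) != c - v_xy:        # C(N') = C - (xy.xy)
        return False
    for r, s, t, u in product(range(5), repeat=4):
        if r == s or t == u:
            continue
        old = transpedance(5, EDGES, r, s, t, u)
        cross = transpedance(5, EDGES, r, s, X, Y) * transpedance(5, EDGES, t, u, X, Y)
        if transpedance(5, REDUCED, r, s, t, u) != old - (old * v_xy - cross) / c:
            return False
    return True

print(formula_6_holds())
```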

The third case of externally equivalent networks, in which n = 3, arises as
follows. Consider two consecutive edges of the girdle of a planar reflex M.
Let their common end be A, and the other two ends B and C. Then by the
results of [1],

(7)    V(AB) = V(AC) = ½C(M).

Let us now suppress AB and AC, and distinguish transpedances referring
to the new network, M_1 say, by primes. We have

       V'(AB) = V(AB) − (V(AB).V(AC) − (AB.AC)^2)/C(M),

by (6). We have used the fact, evident from the definitions, that a
transpedance (xy.rs) is independent of c_xy and c_rs. Hence by (7) and (6),

       V'(AB) = V(AC) − (V(AC).V(AB) − (AC.AB)^2)/C(M) = V'(AC).

We take N and N' to have each the same structure as M_1. But whereas in
N, A_1, A_2 and A_3 are taken to be A, B, and C respectively, in N', A'_1, A'_2
and A'_3 are taken to be A, C, and B respectively.


FIGURE 1


Our first method for the construction of perfect squares is that briefly des- 
cribed in the companion paper. We construct a planar reflex in which the 
part of the network on one side of the girdle is one of two externally equiva- 
lent networks, the vertices of the polar set but no others being on the girdle. 
If we replace this part by the other network of the pair, the squared rectangles 
obtained by taking edges of the girdle as polar edges will still be squared 
squares, by the Theorem, but in general there will be no evident reason why 
these squared squares should not be perfect. Figure 1 shows a planar reflex 
in which the part of the network on one side of the girdle is a rotor. The 
reflex is represented as projected in the equatorial plane. The part of the 
network in one hemisphere is represented by full lines, that in the other by 
broken lines. 













The particular case in which the girdle has four edges only is easily seen to 
be the case of Figure V of [5]. Any squared square whose polar edge belongs 
to the girdle has two elements which are bisected by one of the diagonals of 
the whole square, and the remainder of the squared square consists of two con- 
gruent rectangles dissected into squares. The externally equivalent networks 
are here equivalent p-nets of squared rectangles. In order that the squared 
square may be perfect it is necessary that the two p-nets shall be perfect. 
Further, no element of one may be equal to any element of the other: i.e., in 
the terminology of [5], the p-nets must be totally different. Unfortunately 
only very clumsy methods of constructing totally different p-nets are known, 
though many simple cases have been discovered empirically. Most of the 
perfect squares of this type can be reduced in order by making a corner of one 
rectangle overlap an element in the other. One that cannot be so reduced is 
described in [6]. Its reduced side is 1015 and its full side is (1015)^2. Another
perfect square of this class is described by R. Sprague in [2]. Its order can be 
reduced by overlapping. 


4. Use of planar reflexes. A general method for constructing a pair of
totally different perfect rectangles is to take a rotor of four-fold symmetry
and polar set {A_1, A_2, A_3, A_4} and another of three-fold symmetry and polar
set {A_1, A_2, A_3}, the two having no other vertices in common. We take two
of these four vertices as poles for one p-net. For the other we take the same
poles and replace each rotor by its mirror image.

In general there is no evident reason why the two equivalent squared 
rectangles thus obtained should not be perfect and totally different. I know 
of only one case in which the necessary calculations have been performed. In 
this case the rectangles are totally different and have reduced sides 115407650 
and 160618071. In the notation of C. J. Bouwkamp² they are:

(48217845, 55448257, 56951969), (30183899, 18033946), (10803534, 15373190, 
27767821, 1503712), (58455681), (12149953, 12117871, 4569656), (7548215, 
12394631), (3712938, 11106732, 4846416), (16831932, 13489890, 8331174, 
3680856), (7393794), (12817296, 32191572), (5587698, 2743476), (2844222, 
11842800, 6556980), (3342042, 7181904, 2536962, 428982), (2107980, 6752922), 
(4644942), (20173974), (19374276), (16334112, 2245656), (14088456) 
and: 

(64077519, 52804978, 43735574), (9069404, 34666170), (24019929, 18016469, 
19837984), (51330131, 12747388), (6003460, 10191494, 1821515), (8369979, 
7530738, 5758782), (38582743, 4188034), (1771956, 7679724, 11964522, 
19008750), (9302694), (14286054, 8463453), (3394926, 4284798), (5822601, 
10317432, 5021040), (4131168, 4062330, 8055822), (68838, 3993492), (5296392, 
3924654), (20108655), (17997156, 1011594), (16985562), (15613824). 

We can construct a perfect square from these two rectangles as explained
above. Or we may modify the method by making the rectangles overlap in
a suitable corner element. Thus, taking the corner element to be that of
side 48217845, we obtain a perfect square of the 85th order and of reduced
side 227807876.

²See [7]. The notation is also explained in [1], Sec. (5).


Another perfect square of this “overlapping” type is of the 29th order and
has reduced side 1424. It was discovered empirically. It is:

(193, 285, 186, 273, 462), (99, 87), (101, 92), (360), (9, 119, 348), (110), (229),
(133, 329), (51, 82), (791, 158, 13), (64), (33, 49), (81, 16), (65), (633).

An extreme case of this overlapping method arises when the corner element 
to be overlapped has a side in common with the rectangle to which it belongs. 
Then when we overlap the two rectangles and suppress the overlapped element 
we get a square of the type shown in [5], Fig. 9. Perfect squares of this type 
seem fairly common. There is the one of [5], Fig. 9, which appears to be the 
perfect square of smallest known order (26); there is another of reduced side
1015, of full side (1015)^2, and of the 28th order, given by:

(280, 372, 363), (188, 92), (93, 270), (119, 261, 84), (177), (165, 23), (142), 
(163, 120, 167, 183, 382), (43, 30, 47), (13, 17), (219), (215), (199), 

and a third, also of the 28th order, with reduced side 1073 and full side (1073)^2,
given by:

(244, 153, 248, 169, 259), (91, 62), (79, 90), (29, 33), (364), (360), (349), (465, 
252, 156, 89, 111), (67, 22), (133), (135, 88), (221), (213, 39), (174). 
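
A bracketed description of this kind can be checked mechanically. The sketch below reads the notation in the way it is usually understood (an assumption here): each bracket lists, from left to right, the squares whose upper sides lie on one horizontal segment, and the brackets are taken from the top of the figure downwards, leftmost first. The function name and the "skyline" bookkeeping are illustrative choices, not part of the original account. It is applied to the square of reduced side 1073 given above.

```python
def decode_bouwkamp(groups, width):
    """Place the squares of a bracketed code inside a strip of the given width.

    Returns the list of (side, x, y) placements and the final contour.
    The contour is a list of [x, length, depth] segments, where depth is
    measured downwards from the upper side of the rectangle.
    """
    skyline = [[0, width, 0]]
    placed = []
    for group in groups:
        # Least filled segment of the contour, leftmost in case of a tie.
        i = min(range(len(skyline)), key=lambda k: (skyline[k][2], skyline[k][0]))
        x, length, depth = skyline[i]
        if sum(group) != length:
            raise ValueError(f"group {group} does not fit a segment of length {length}")
        new_segments = []
        for side in group:
            placed.append((side, x, depth))
            new_segments.append([x, side, depth + side])
            x += side
        skyline[i:i + 1] = new_segments
        # Merge neighbouring segments that have reached the same depth.
        merged = [skyline[0]]
        for seg in skyline[1:]:
            if seg[2] == merged[-1][2]:
                merged[-1][1] += seg[1]
            else:
                merged.append(seg)
        skyline = merged
    return placed, skyline

CODE_1073 = [
    (244, 153, 248, 169, 259), (91, 62), (79, 90), (29, 33), (364), (360), (349),
    (465, 252, 156, 89, 111), (67, 22), (133,), (135, 88), (221,), (213, 39), (174,),
]
# A bare "(364)" is an integer in Python, so single-element brackets are normalised.
CODE_1073 = [g if isinstance(g, tuple) else (g,) for g in CODE_1073]

placed, skyline = decode_bouwkamp(CODE_1073, 1073)
sides = [s for s, _, _ in placed]
print(len(sides), len(set(sides)), skyline)   # prints: 28 28 [[0, 1073, 1073]]
```

The final contour is a single segment at depth 1073, so the 28 squares, all of different sides, fill a square of side 1073 exactly.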

The second of these squares is given in [6]. 

It is interesting to note that there are two quite different perfect squares of
the 28th order having reduced side 1015 and full side (1015)^2. Their first
publication seems to have been in a note by A. H. Stone ([6]). One is
completely described above; the other has been completely described by C. J.
Bouwkamp ([7], p. 75).

We go on to consider the case in which the planar reflex has just 6 edges in 
the girdle. In general the corresponding symmetrical squared square will 
have just three elements which are bisected by the symmetry diagonal. (See 
Fig. 6 of [5].) Evidently perfect squares derived from such a planar reflex by 
replacing the part of the network on one side of the girdle by an externally 
equivalent network contain smaller squared rectangles. That is they are not 
simple. Such a smaller squared rectangle is formed by the middle diagonal 
element and the elements above the bisecting diagonal. 

The structure, but not the elements, of one such perfect square is given in
[5] ((8.2), second paragraph). The elements have been published since by C.
J. Bouwkamp.³ Its reduced side is 1813.

Another example, due to R. L. Brooks, is as follows: 

(2378, 1163, 1098), (65, 1033), (737, 491), (249, 242), (7, 235), (478, 259), 
(256), (324, 944), (219, 296), (1030, 829, 519, 697), (620), (341, 178), (163, 
712, 1564), (201, 440, 157, 31), (126, 409), (283), (1231), (992, 140), (852). 
Its reduced side is 4639. Both these squares are of the 39th order. 


³[7], p. 75.













In [5], Sec. 9, a certain infinite sequence of perfect squares is discussed. 
Its members are all of the above type. 

Consider the case where the reflex has eight edges in its girdle. In general 
the symmetry diagonal of a corresponding squared square bisects four of the 
elements (Fig. 2). In this case we may hope to derive simple perfect squares. 
































FIGURE 2


One simple perfect square of this type (due to C. A. B. Smith and W. T. Tutte) 
is known. It is of the 52nd order. In Bouwkamp’s notation it is as follows: 
(51573, 41645, 88851), (9928, 31717), (41320, 20181), (14795, 5386), (3778, 
12450, 15489), (9164), (9411, 3039), (245, 8919), (15040), (18528, 14970, 
26382, 47499), (32624, 8696), (6174, 12156), (192, 5982), (23928), (3558, 
11412), (18138, 10056, 12030), (4176, 10721, 22897), (8082, 1974), (11635, 6545),
(56552, 20693, 5527), (5090, 12176), (15166, 7086), (1780, 45719), (43939),
(35859). 

The reduced side of this perfect square is 182069. 

The externally equivalent networks used for the perfect squares of reduced 
sides 1813, 4639 and 182069 are rotors. 


5. Overlaps. In our discussion above of the case of a girdle with four edges 
we used the device of making two equivalent squared rectangles overlap in a 
corner element. A similar but less trivial process is applicable when the num- 
ber of edges in the girdle exceeds four. 

Suppose we have a c-net N, and suppose P_i, P_j, P_k, P_l are four vertices of
N, the pairs (P_i,P_j) and (P_k,P_l) being joined by edges. Suppose further that

       (i)  (ij.kl) = 0
and    (ii) V_kl = ½C(N).

Equation (ii) states that the edge P_kP_l, taken as polar edge, corresponds
to a squared square.

We obtain a new “electrical” network N_0 from N by changing the conductance
of the edge joining P_k and P_l from 1 to −1. Then C(N_0) = C(N) −
2V_kl = 0 by the definitions, and the properties of determinants. We may
therefore no longer assert that the Kirchhoff equations for a flow in N_0 have
a unique solution.
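
The relation C(N_0) = C(N) − 2V_kl holds for any edge of conductance 1, and gives zero exactly when V_kl = ½C(N). The sketch below checks both statements; the two test networks (an arbitrary five-vertex network and the tetrahedron, in which every edge satisfies (ii)) and the function names are assumptions made for illustration.

```python
from fractions import Fraction

def det(matrix):
    """Exact determinant by Gaussian elimination over the rationals."""
    m = [[Fraction(x) for x in row] for row in matrix]
    n, d = len(m), Fraction(1)
    for i in range(n):
        pivot = next((r for r in range(i, n) if m[r][i] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != i:
            m[i], m[pivot] = m[pivot], m[i]
            d = -d
        d *= m[i][i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n):
                m[r][c] -= f * m[i][c]
    return d

def reduced_kirchhoff(n, edges):
    """Kirchhoff matrix with the row and column of the last vertex removed."""
    k = [[0] * n for _ in range(n)]
    for a, b, c in edges:
        k[a][a] += c
        k[b][b] += c
        k[a][b] -= c
        k[b][a] -= c
    return [row[:n - 1] for row in k[:n - 1]]

def complexity(n, edges):
    return det(reduced_kirchhoff(n, edges))

def V(n, edges, r, s):
    """V_rs = (rs.rs), computed as C times the potential drop from r to s."""
    a = reduced_kirchhoff(n, edges)
    g = n - 1
    rhs = [0] * g
    if r != g:
        rhs[r] += 1
    if s != g:
        rhs[s] -= 1

    def scaled_potential(v):
        if v == g:
            return Fraction(0)
        av = [row[:] for row in a]
        for i in range(g):
            av[i][v] = rhs[i]
        return det(av)

    return scaled_potential(r) - scaled_potential(s)

def flip_edge(edges, k, l):
    """Copy of the network with the conductance of the edge (k, l) changed to -1."""
    return [(a, b, -1 if (a, b) == (k, l) else c) for a, b, c in edges]

EDGES = [(0, 1, 1), (1, 2, 1), (2, 3, 1), (3, 4, 1), (4, 0, 1), (0, 2, 1), (1, 3, 1)]
K4 = [(a, b, 1) for a in range(4) for b in range(a + 1, 4)]

general_case = complexity(5, flip_edge(EDGES, 0, 2)) == complexity(5, EDGES) - 2 * V(5, EDGES, 0, 2)
square_case = complexity(4, flip_edge(K4, 0, 1))
print(general_case, square_case)   # prints: True 0
```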

Consider the full flow in N with positive pole P_i and negative pole P_j. In
virtue of (i) this remains a flow, satisfying Kirchhoff’s Laws except at P_i and
P_j, in N_0. We denote it by F. Next consider the full flow in N with positive
pole P_k and negative pole P_l. This will give rise to a flow F_0 in N_0 when we
reverse the current in the edge whose conductance is changed to −1. We
note that by (ii), F_0 must satisfy Kirchhoff’s Laws everywhere. It has in fact
no poles.

It follows that for any λ, F + λF_0 is a flow in N_0 with positive pole P_i and
negative pole P_j in which the current entering at P_i and the potential
difference between P_i and P_j are independent of λ, being C(N) and V_ij (referring
to N) respectively (by (i)). An example of such an N_0 is given in Fig. 3. It
corresponds to the generalized squared rectangle of Fig. 4. It will be noticed
that the edge of N_0 of conductance −1 corresponds to a square region Y in
which two elements corresponding to edges of conductance 1 overlap. By
dissecting perfectly the square X in different ways and arranging in each case
that the region of overlap coincides with a corner element of this dissection
(which, together with the “element” corresponding to the conductance −1,
is then suppressed) it is possible to obtain a number of perfect dissections of
the rectangle whose sides are in the ratio 15:17. By suitable choices of the
perfect square involved we can even obtain simple perfect dissections of this
rectangle.





FIGURE 3 (the marked edge has conductance −1)
FIGURE 4 (elements labelled x, 3−x, 4−x, 5+x, 3x, 10−x, 7+x, 7−x)





Now suppose that in N_0, P_k is joined to a vertex P_m other than P_l by an
edge of conductance 1. Then in the flow F + λF_0 the potential difference
between P_l and P_m is given (in terms of N) by

(8)    (ij.lm) + λ(kl.lm).

Now, provided that P_k can be joined to P_l by a simple arc in N passing through
P_m, we have (kl.lm) ≠ 0 (by the results of [1], Sec. (5)). Hence we can choose λ
so that the potentials of P_l and P_m become equal. The resulting flow continues
to obey Kirchhoff’s Laws (for poles P_i, P_j) when P_l and P_m are identified.
Further, after this identification is made we can suppress two edges
joining P_k and P_l = P_m, one of conductance 1 and the other of conductance
−1, without affecting the matrix {c_rs}. If the resulting electrical network
N_1, in which each conductance is now +1, is the p-net (with poles P_i, P_j) of
a squared rectangle, the sides of that rectangle must be in the same ratio as
those of the rectangle corresponding to the poles P_i, P_j in the p-net N. We
describe this operation as overlapping the edges P_kP_l and P_kP_m of N. In
order that the rectangle derived from N_1 shall not be “trivially imperfect”⁴
it is necessary (in general) that P_k shall be incident with at least five edges in
N, and that N shall contain no quadrilateral constituted by P_kP_l, P_kP_m and
two other edges.⁵ Usually we are given that N is planar; we can then arrange
that N_1 is planar by taking P_kP_l and P_kP_m to be consecutive in the cyclic
sequence at P_k of the edges incident with P_k.

The dissection of the rectangle with sides in the ratio 15:17 by the use of
the network of Fig. 3 can be discussed in terms of this overlapping operation.
We first replace one of the edges P_kP_m by a p-net of a perfect square and then
we overlap the edge P_kP_l with one of the edges of this p-net.

Let us now return to the equations (i) and (ii). If in addition the network
N satisfies (iii) V_ij = ½C(N), we say that the edges P_iP_j and P_kP_l are squarely
conjugate. Then P_iP_j, as well as P_kP_l, corresponds to a squared square. Hence
the edge P_iP_j in any planar network N_1 formed from N by overlapping P_kP_l
also corresponds to a squared square.

For an example we refer to Theorem VIII of the companion paper. We see
that in the girdle of any planar reflex any two members of the same class S_1
or S_2 are squarely conjugate. By the Theorem of the present paper this
property is unaffected if the reflex is modified by replacing the part of the
network on one side of the girdle by an externally equivalent network.

It is easily seen, by considering the flows F and F_0, that any two members
of a set of mutually squarely conjugate edges remain squarely conjugate when
any other member of the set is overlapped. Also as a general result any zero
current in the p-net of a squared square which corresponds to an edge squarely
conjugate to the polar edge in the corresponding c-net can be eliminated by
overlapping the corresponding edge.

By applying this method to the members of S_1 in the girdle of a suitably
modified planar reflex (with n > 2) it is possible to obtain a simple perfect
square having no crosses.⁶

This method is essentially the same as that given in Sec. (8.4) of [5], but
the present account is more general. A recent criticism by C. J. Bouwkamp⁷
of the method as described in [5] has now been withdrawn.⁸


⁴If a p-net has a part, not containing a pole, joined to the rest by only two edges, or if it
has a pair of vertices joined by two (or more) edges, these two edges will clearly have
numerically equal currents. If these two currents are non-zero we say that the p-net and its
corresponding squared rectangle are “trivially imperfect”.

⁵We ignore the exceptional case in which P_l and the fourth member P_q of a quadrilateral
P_lP_kP_mP_q have equal potentials in F + λF_0.

⁶See [1], Sec. (5). It is easily seen that a zero current in a p-net corresponds to a cross in
the corresponding squared rectangle, unless this zero current belongs to a part of the p-net
joined to the rest at only one vertex P and containing no pole which is not P.

⁷[7], p. 75.

⁸[8] and [9].







Two examples of simple crossless perfect squares obtained by this sort of 
overlapping have been published in [8] and [9]. We give below a perfect 
square containing one cross, obtained by eliminating one of the zero currents 
in a modified planar reflex. This is of particular interest in that the externally 
equivalent networks involved are not rotors. They are derived from the 
planar reflex of Fig. 2 of [1] according to the method described in Sec. (2). 
Fig. 2 of [1] is lettered in accordance with that description. The perfect 
square, due to R. L. Brooks, is of the 38th order and has reduced side 4920. 
It is: 

(1348, 1092, 893, 1587), (199, 694), (256, 420, 615), (1440, 164), (584), 
(120, 984, 1177), (281, 454), (108, 173), (692), (627), (217, 527, 240), (47, 1130), 
(2132, 534, 310), (287), (224, 900), (758), (104, 1026), (82, 922), (840). 


6. Use of a dualizing tetrad axis. The next method to be described utilizes 
a planar reflex which has a dualizing tetrad axis perpendicular to the dualizing 
plane. (It can be shown that such a map is also a central reflex.) A reflex of 
this type, from which perfect squares have been derived, is shown in Fig. 5. 
It will suffice to describe the application of the method to this reflex. 

If X and Y are any two vertices of the reflex, we denote by F(X Y) the full 
flow with positive pole X and negative pole Y. 







FIGURE 5


As we saw in Sec. (1) the flows F(AB) and F(CD) correspond to squared 
squares (with the symmetry of the swastika). Also (AB.CD) =0 by Theorem 
IV of the companion paper. Thus the edges AB and CD are squarely conjugate. 
Choose two diametrically opposite edges such as GH and MN in the girdle. 














We proceed to show that AB and CD remain squarely conjugate when the 
edges GH and MN are suppressed and the two vertices G and H identified. 
To do this we consider the linear combination of flows F(AB) + bF(GH) 
+cF(MN). We try to arrange that 
(9) (AB.GH) + 0(GH.GH) + c(MN.GH) = 0, 

(AB.MN) + b(GH.MN) + c(MN.MN) = cC 


where C is the complexity of the reflex. Since (MN.GH) = 0 (Theorem VIII 
of the companion paper) and since (GH.GH) and C— (MN.MN) are non- 


zero, ({1], Sec. (5)), we can do this by setting 6 = — (AB.GH)/(GH.GH) and 
c = (AB.MN)/(C — (MN.MN)), that is b = ¢c = — (AB.GH)/}4C since the 
map is a planar reflex and (AB.GH) = — (AB.MN) by symmetry. 


It is easily seen that when (9) holds, and we perform the operation 
of suppressing the edges GH and MN and identifying the vertices G,H we 
obtain from F(AB) + bF(GH) + cF(MN) a flow in the new network which 
satisfies Kirchhoff’s Laws everywhere except at the vertices A and B. The 
total current flowing from A in this flow is C. The currents in AB and CD in 
this flow are the same as in F(AB) since b = c and (by symmetry) (AB.GH)
= — (AB.MN) and (CD.GH) = —(CD.MN). Hence AB and CD remain 
squarely conjugate. (To prove that CD still corresponds to a squared square 
we use a similar argument with F(CD) replacing F(AB).) 

We can apply a similar operation to any other pair of diametrically opposite
edges of the girdle. Indeed we can operate simultaneously on several such
pairs provided that all the edges of the girdle concerned belong to the same
class S_1 or S_2. Then the operations do not interfere with one another (a
consequence of Theorem VIII of the companion paper). In an actual computation
the edges EF, GH, IJ, KL, MN, and OP were suppressed, and the pairs
(E,F), (G,H) and (I,J) were each identified. The flow AB was then found
to represent a simple perfect square of the 70th order and of reduced side
384948. This square is:
(74378, 83540, 71817, 71781, 83432), (60130, 11651), (11723, 60094), (65216, 
9162), (95083), (56054, 48371), (41113, 48772, 78710), (72887, 48383), 
(33454, 7659), (32106, 62977), (26493, 29938), (24504, 23879), (9708, 
23746), (23048, 3445), (19603, 28332, 65393, 30871), (19549, 4330), (97391), 
(14038), (1181, 21911, 14692, 13994, 19928, 8729), (20730), (37061), (34522, 
59326), (8060, 5934), (7219, 7473), (25862), (49606, 254), (111, 7949), (7838), 
(15787), (44887, 108934, 24804), (84130), (30446, 19160), (75076, 22315), 
(64047), (52761). 
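
The printed listing can be given a quick sanity check by machine. The sketch below is only an illustration, not part of the original computation: it assumes the usual reading of such a listing (each parenthesis holds the sides of squares whose upper edges lie on one level), and the names `CODE_70`, `parse_code`, `ORDER`, `TOP_ROW` and `AREA_MATCHES` are hypothetical. It counts the constituent squares, sums the first group, which should span the top edge, and compares the total area with the square of the reduced side. A failure of the area comparison would point to a transcription slip in the digits rather than in the method.

```python
# Bracketed code of the 70th-order square, copied from the listing above.
# Assumption: each group lists the sides of the squares whose upper edges
# lie on one horizontal level, read from left to right.
CODE_70 = """
(74378, 83540, 71817, 71781, 83432), (60130, 11651), (11723, 60094), (65216,
9162), (95083), (56054, 48371), (41113, 48772, 78710), (72887, 48383),
(33454, 7659), (32106, 62977), (26493, 29938), (24504, 23879), (9708,
23746), (23048, 3445), (19603, 28332, 65393, 30871), (19549, 4330), (97391),
(14038), (1181, 21911, 14692, 13994, 19928, 8729), (20730), (37061), (34522,
59326), (8060, 5934), (7219, 7473), (25862), (49606, 254), (111, 7949), (7838),
(15787), (44887, 108934, 24804), (84130), (30446, 19160), (75076, 22315),
(64047), (52761)
"""
REDUCED_SIDE = 384948  # reduced side quoted in the text


def parse_code(text):
    """Split a bracketed listing into a list of groups of integer sides."""
    groups = []
    for chunk in text.replace("\n", " ").split(")"):
        chunk = chunk.strip(" ,(")
        if chunk:
            groups.append([int(tok) for tok in chunk.split(",")])
    return groups


GROUPS = parse_code(CODE_70)
ORDER = sum(len(group) for group in GROUPS)   # number of constituent squares
TOP_ROW = sum(GROUPS[0])                      # squares along the top edge
# True only if the listed areas add up to the area of the whole square.
AREA_MATCHES = sum(s * s for g in GROUPS for s in g) == REDUCED_SIDE ** 2

print("order:", ORDER)
print("top row sum:", TOP_ROW)
print("areas add up to side squared:", AREA_MATCHES)
```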

As AB and CD are squarely conjugate, this square contains a cross corres- 
ponding to the edge CD. This cross can be eliminated by overlapping CD 
as explained in the preceding section. One method of doing this converts the 
above perfect square into one of the 69th order having reduced side 7919535. 







It is: 

(1543151, 1726140, 1477594, 1469823, 1702827), (1236819, 233004), (248546, 
1229048), (1360162, 182989), (1935831), (1177173, 980502), (887428, 985625, 
1573316), (1507596, 1029739), (789231, 98197), (614300, 1294531), (496131, 
587691), (477857, 551882), (404571, 91560), (313011, 560159, 1367466, 
653231), (222716, 566515), (1985453), (430799, 121083), (470434, 247148), 
(343799), (807307), (396716, 34083), (714235, 1233527), (362633, 224917, 
243260, 113587), (584021), (137716, 87201), (68858, 174402), (50515, 105544), 
(947580), (279946), (961572, 2272111, 519292), (1752819), (598613, 348967), 
(1523173, 462280), (1310539), (1060893). 


This square is simple and has no cross. 


Postscript 


Mr. T. H. Willcocks, of 24 Pembroke Rd., Clifton, Bristol, England, has 
written to say that he published in the Fairy Chess Review (August, 1948), a 
dissection of a square into 24 unequal squares, the reduced side being 175. It 
is hoped that a note by him on this and other dissections will appear in this 
Journal shortly. 


REFERENCES 


[1] C. A. B. Smith and W. T. Tutte, “A Class of Self-Dual Maps,” Can. J. Math., vol. II,
no. 2 (1950).

[2] R. Sprague, “Beispiel einer Zerlegung des Quadrats in lauter verschiedene Quadrate,”
Math. Zeit., vol. 45 (1939), 607-608.

[3] ---, “Über die Zerlegung von Rechtecken in lauter verschiedene Quadrate,” Journal
für die reine und angewandte Mathematik, vol. 182 (1940), 60-64.

[4] ---, “Zur Abschätzung der Mindestzahl inkongruenter Quadrate, die ein gegebenes
Rechteck ausfüllen,” Math. Zeit., vol. 46 (1940), 460-471.

[5] R. L. Brooks, C. A. B. Smith, A. H. Stone and W. T. Tutte, “The Dissection of
Rectangles into Squares,” Duke Math. J., vol. 7 (1940), 312-340.

[6] A. H. Stone, “Question E 401 and Solution,” Amer. Math. Monthly, vol. 47 (1940), 570-572.

[7] C. J. Bouwkamp, “On the Dissection of Rectangles into Squares,” Papers I-III,
Koninklijke Nederlandsche Akademie van Wetenschappen, Proc., vol. 49 (1946),
1176-1188, and vol. 50 (1947), 58-78.

[8] ---, “On the Construction of Simple Perfect Squared Squares,” Koninklijke
Nederlandsche Akademie van Wetenschappen, Proc., vol. 50 (1947), 1296-1299.

[9] R. L. Brooks, C. A. B. Smith, A. H. Stone and W. T. Tutte, “A Simple Perfect Square,”
Koninklijke Nederlandsche Akademie van Wetenschappen, Proc., vol. 50 (1947), 1300-1301.





University of Toronto 














WATER WAVES OVER A CHANNEL OF FINITE DEPTH 
WITH A SUBMERGED PLANE BARRIER 


ALBERT E. HEINS 


1. Introduction and formulation of the problem. This is the third in a
series of problems in the study of surface waves which have been disturbed by
the presence of a plane barrier and to which a solution may be provided. We
assume, as in part I,¹ that the fluid is incompressible and non-viscous, and that
the motion is irrotational. The differential equation to be solved is
$$\nabla^2\Phi = \Phi_{xx} + \Phi_{yy} + \Phi_{zz} = 0, \tag{1.1}$$
where $\Phi_{xx}$ denotes a second partial differentiation with respect to $x$, $\Phi_{yy}$ with
respect to $y$, etc.; $\Phi(x,y,z)$ is the velocity potential of the fluid, and from it
we may find the components of velocity in the fluid. On a rigid surface, the
boundary condition is that there be no component of velocity normal to the
surface. Translated into terms of $\Phi$, we have $\Phi_n = 0$, where the subscript $n$
denotes outer normal derivative. On a free surface² $\Phi_n = \beta\Phi$, where $\beta$ is a
physical constant which is positive. The time variation which normally appears
in $\Phi$ has been suppressed by the assumption that it is monochromatic. We
shall assume, as in part I, that the $z$ variation of $\Phi(x,y,z)$ is harmonic. That
is, $\Phi(x,y,z) = \exp(ikz)\,\phi(x,y)$, so that equation (1.1) reduces to
$$\phi_{xx} + \phi_{yy} - k^2\phi = 0, \tag{1.2}$$
while the boundary conditions remain unaltered.

The geometric region over which we wish to solve equation (1.2) is a channel
of finite depth $a$, but infinite in length. Parallel to the floor of the channel
is a semi-infinite rigid barrier which is $b$ units of length from the floor ($b < a$).
In an $xyz$ coordinate system, this figure may be described as follows.

(i) $y = 0$, $-\infty < x < \infty$, $-\infty < z < \infty$: the floor of the channel,

(ii) $y = b$, $x \ge 0$, $-\infty < z < \infty$: the semi-infinite rigid plane barrier, and

(iii) $y = a$, $-\infty < x < \infty$, $-\infty < z < \infty$: the free surface.

Since the z variation has been suppressed, equation (1.2) is two-dimensional 
and we shall show that the present boundary value problem may be formulated 
as an integral equation of the Wiener-Hopf type. Furthermore, the analytical 
conditions required for the solution of a Wiener-Hopf integral equation are 


Received December 9, 1948. Presented to the American Mathematical Society, April 1948. 

¹Albert E. Heins, “Water waves over a channel of finite depth with a dock”, American
Journal of Mathematics, vol. 70 (1948), pp. 730-748. Henceforth we shall refer to this paper
as I. The second part of this series is entitled “Water waves over a channel of infinite depth
with a dock” and is to be submitted for publication shortly. Reference to the physical
background of the problem discussed in this paper may be found in I.

²The constant $\beta$ is defined in Sec. 5 of this paper.




satisfied here, and hence we are in a position to solve the integral equation. 
We shall find that the Fourier transform of the unknown function in this 
integral equation provides us with interesting mathematical properties of the 
solution. Because we employ Fourier transform techniques here, we shall 
obtain the desired transform as a by-product of the work. 

One feature which distinguishes this problem from the one treated in part I
is that we do not have to assume the presence of a two-dimensional line source
to provide a travelling wave solution for $x \to -\infty$ or for $x \to \infty$, $y > b$. We
shall simply require that for $x \to -\infty$, $\phi(x,y)$ be asymptotic to
$$[\alpha_1 \exp(i\kappa x) + \beta_1 \exp(-i\kappa x)]\cosh p_0 y/a, \qquad \kappa^2 = p_0^2/a^2 - k^2,$$
where $p_0$ is the real positive root of the transcendental equation
$$p \sinh p - a\beta \cosh p = 0.$$
For $x \to \infty$, $y > b$, we shall assume that $\phi(x,y)$ is asymptotic to
$$[\alpha_2 \exp(i\kappa' x) + \beta_2 \exp(-i\kappa' x)]\cosh p'_0(b - y)/c,$$
where $c = a - b$, and $p'_0$ is the real positive root of³
$$p' \sinh p' - \beta c \cosh p' = 0, \qquad \kappa'^2 = p_0'^2/c^2 - k^2.$$
These asymptotic forms are obtained by considering first the asymptotic form
of the solution of equation (1.2) when the semi-infinite barrier is not present,
and second when the semi-infinite barrier extends to negative infinity. The
main point here is that if we are sufficiently far away from the point $x = 0$,
$y = b$, the above two asymptotic forms present themselves as the bounded
solutions of two well-known potential problems. In order to insure that we
obtain the bounded solutions for $x \to \infty$ and $x \to -\infty$, we require further that
$\kappa$ and $\kappa'$ be real. The complex exponential notation describes travelling waves
to the right and the left in the $x$ direction. The convention is that $\exp(i\kappa x)$
represents a travelling wave to the right while the one with the negative sign
is travelling to the left. We shall find that there exist two sets of linear
relations between $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$, and thus we can find the amplitude of the
reflected and transmitted waves to the left and to the right of $x = 0$.
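
The two transcendental equations above each have a single positive root, which is easy to obtain numerically. The following is a minimal sketch, not taken from the paper: it assumes the equation is used in the equivalent form p tanh p = (depth)(β), and the function names and the sample values of a, b, β and k are purely illustrative. It also checks the requirement that κ and κ' be real.

```python
import math


def positive_root(depth_times_beta, tol=1e-13):
    """Positive root p of p*sinh(p) - (depth*beta)*cosh(p) = 0.

    Dividing by cosh(p) gives p*tanh(p) = depth*beta, whose left side
    increases from 0, so bisection on a bracketing interval suffices.
    """
    def f(p):
        return p * math.tanh(p) - depth_times_beta

    lo, hi = 0.0, depth_times_beta + 1.0  # f(lo) < 0 < f(hi)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def wave_number(depth, beta, k):
    """Return (p, kappa) with kappa**2 = p**2/depth**2 - k**2, kappa real."""
    p = positive_root(depth * beta)
    kappa_sq = (p / depth) ** 2 - k ** 2
    if kappa_sq <= 0.0:
        raise ValueError("kappa is not real for these parameters")
    return p, math.sqrt(kappa_sq)


# Hypothetical sample values: total depth a, barrier height b, beta and k.
a, b, beta, k = 1.0, 0.4, 1.0, 0.5
p0, kappa = wave_number(a, beta, k)                   # x -> -infinity
p0_prime, kappa_prime = wave_number(a - b, beta, k)   # x -> +infinity, y > b
print(p0, kappa, p0_prime, kappa_prime)
```

If `wave_number` raises, the chosen k is too large for a travelling wave to exist in that part of the channel, which is the case excluded by the reality assumption on κ and κ'.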

The formulation of this problem proceeds along the lines which we described
in part I. We may express $\phi(x,y)$ in the strip in terms of an appropriate
Green's function $G(x,y,x',y')$ and the discontinuity of $\phi$ across the barrier
$x \ge 0$, $y = b$. We find that we can produce the travelling wave solutions with
a source-free $\phi$ simply by demanding the mode of excitation which we described
above, so that there is no difficulty in applying Green's Theorem save
for $|x|$ very large. The Green's function satisfies an equation of the form (1.2)
except at the point $x = x'$, $y = y'$. At this point⁴

³In order to formulate the integral equation, this asymptotic form need not be specified so
definitely. Indeed from the Green's function which we employ, we shall find that $\phi(x,y)$ need
only grow less rapidly than exp [—p':x/a] for x—> ©. With the solution of the integral equa- 
tion we shall find that ¢(x,y) has the prescribed asymptotic form for x—> ©, y> b. 

⁴For further details see A. Sommerfeld, “Die Greensche Funktion der Schwingungsgleichung”,
Deutsche Mathematiker Vereinigung, vol. 21 (1912), pp. 309-353. In particular, for a discussion
of the logarithmic character of the Green's function employed here, see I, p. 735.













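$$\int_{0}^{a} G_x\,\Big|_{x=x'-0}^{x=x'+0}\,dy = -1 \qquad\text{and}\qquad \int_{-\infty}^{\infty} G_y\,\Big|_{y=y'-0}^{y=y'+0}\,dx = -1.$$

As for the boundary conditions on the Green's function, we take
$$G_y = 0 \quad\text{when}\quad y = 0,\ -\infty < x < \infty,$$
and
$$G_y = \beta G \quad\text{when}\quad y = a,\ -\infty < x < \infty.$$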
We have described this Green's function elsewhere and we simply give it
here for reference. We have
$$G(x,y,x',y') = \sum_{n=1}^{\infty}\frac{(p_n^2 + a^2\beta^2)(\cos p_n y/a)(\cos p_n y'/a)\exp\{-[p_n^2 + a^2k^2]^{1/2}|x - x'|/a\}}{(p_n^2 - a\beta + a^2\beta^2)(p_n^2 + a^2k^2)^{1/2}}$$
$$\qquad - \frac{[\cosh p_0 y/a][\cosh p_0 y'/a][p_0^2 - a^2\beta^2][\sin\kappa|x - x'| + \sin\kappa(x - x')]}{a\kappa(\beta a + p_0^2 - a^2\beta^2)},$$
where $p_0$ is the real positive root of
$$p \sinh p - \beta a \cosh p = 0 \tag{1.3}$$
while the $p_n$ are the positive imaginary roots of equation (1.3). In passing, we
observe that $G(x,y,x',y') = O[\exp\{p_1^2 + a^2k^2\}^{1/2}(x - x')/a]$ for $x' - x \to \infty$ and
$O[\sin\kappa(x - x')]$ for $x - x' \to \infty$.
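
The constants $p_n$ entering the series can be located numerically. The sketch below rests on an assumption about notation: it takes the imaginary roots of (1.3) to be $\pm i p_n$ with $p_n$ real and positive, which is how they appear in the factors $\cos p_n y/a$, so that $p_n$ satisfies $p \sin p + a\beta \cos p = 0$. The function names and sample values are illustrative only. It also evaluates the exponential decay rate $(p_1^2 + a^2k^2)^{1/2}/a$ quoted above.

```python
import math


def imaginary_root(n, a_beta, tol=1e-13):
    """n-th positive p with p*sin(p) + a_beta*cos(p) = 0 (n = 1, 2, ...).

    Putting p -> i*p in p*sinh(p) - a_beta*cosh(p) = 0 gives this equation.
    The function changes sign on ((n - 1/2)*pi, n*pi), which brackets the root.
    """
    def f(p):
        return p * math.sin(p) + a_beta * math.cos(p)

    lo, hi = (n - 0.5) * math.pi, n * math.pi
    f_lo_positive = f(lo) > 0.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0.0) == f_lo_positive:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def decay_rate(a, beta, k):
    """Rate (p_1**2 + a**2 k**2)**0.5 / a of the slowest decaying mode."""
    p1 = imaginary_root(1, a * beta)
    return math.sqrt(p1 * p1 + (a * k) ** 2) / a


# Hypothetical sample values for the depth a, the constant beta and k.
a, beta, k = 1.0, 1.0, 0.5
roots = [imaginary_root(n, a * beta) for n in range(1, 6)]
print(roots)
print(decay_rate(a, beta, k))
```

The printed roots approach the multiples of π as n grows, the difference being of order aβ/nπ.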

We have from Green's theorem that
$$\phi(x,y) = \oint\,[G(P,P')\,\phi_{n'}(P') - \phi(P')\,G_{n'}(P,P')]\,ds',$$
where the path of integration is a rectangle with a cut along $y = b$, $x > 0$.
More precisely we follow the sequence of line segments given below in the
same order. (Here $l$ and $l_1$ are sufficiently large positive numbers.)

    x = -l_1  to  x = l,      y = 0;
    y = 0     to  y = b - 0,  x = l;
    x = l     to  x = 0,      y = b - 0;
    x = 0     to  x = l,      y = b + 0;
    y = b + 0 to  y = a,      x = l;
    x = l     to  x = -l_1,   y = a;
and y = a     to  y = 0,      x = -l_1.


The paths below and above the line y = b, x > 0 are connected by a line 
segment which does not cross the rigid barrier. In view of the boundary 
conditions imposed on G(P,P’) and ¢(P) we get immediately that 


$$\phi(x,y) = \int_0^{l}[\phi(x',b+0) - \phi(x',b-0)]\,G_{y'}(x,y,x',b)\,dx'$$
$$\qquad + [\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)]\cosh p_0 y/a + O[\exp(-\delta l)] + O[\exp(-\delta_1 l_1)],$$
where $\delta$ and $\delta_1$ are two positive constants. Clearly then







$$\phi(x,y) = \int_0^{\infty} I(x')\,G_{y'}(x,y,x',b)\,dx' + [\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)]\cosh p_0 y/a, \tag{1.4}$$
when $l$ and $l_1$ become infinite. $I(x)$ is the discontinuity of $\phi(x,y)$ across the
barrier, that is $\phi(x,b+0) - \phi(x,b-0)$. Now from the equation (1.4) we may
form the desired integral equation by noting that $\phi_y(x,b) = 0$ for $x \ge 0$. Hence
we have the integral equation
$$\int_0^{\infty} I(x')\,G_{yy'}(x,b,x',b)\,dx' + [p_0/a][\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)]\sinh p_0 b/a = 0, \qquad x > 0. \tag{1.5}$$
This is an inhomogeneous integral equation of the Wiener-Hopf type because
of the limits of integration and the particular $x$ variation of its kernel.


2. The solution of the integral equation. We first rewrite equation (1.5)
so that it is defined for all $x$. To this end we write
$$\int_{-\infty}^{\infty} I(x')\,G_{yy'}(x,b,x',b)\,dx' + \phi_0(x) = \psi(x), \tag{2.1}$$
where
$$I(x) = 0, \quad x < 0; \qquad \psi(x) = 0, \quad x > 0;$$
and
$$\phi_0(x) = 0, \quad x < 0; \qquad \phi_0(x) = [p_0/a][\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)]\sinh p_0 b/a, \quad x > 0.$$


Before we attempt to apply Fourier transform techniques to equation (2.1),
we investigate the nature of the growths of $I(x)$ and $\psi(x)$ for $x \to \infty$ or $x \to -\infty$
as the case may be. In the first place,
$$I(x) = O[\exp(\pm i\kappa' x)], \qquad x \to \infty.$$
As we have remarked, it is not necessary to state this so specifically. For
$x \to -\infty$,
$$\psi(x) = O\Big[\int_0^{\infty}\exp\{(p_1^2 + a^2k^2)^{1/2}(x - x')/a\}\,I(x')\,dx'\Big],$$
which implies the convergence of the integral
$$\int_0^{\infty}\exp\{-(p_1^2 + a^2k^2)^{1/2}x'/a\}\,I(x')\,dx',$$
and this has been ensured by our assumption on $I(x)$. It also implies that
$$\psi(x) = O[\exp(p_1^2 + a^2k^2)^{1/2}x/a], \qquad x \to -\infty.$$
We assume throughout that $\psi(x)$ and $I(x)$ are integrable over any finite interval
of the $x$ axis and this, of course, is subject to verification with the solution.













Having this information, we can make some pertinent statements regarding
the regions of regularity of the Fourier transforms of $I(x)$, $\psi(x)$ and $G(x,b,x',b)$.

We first examine the bilateral Fourier transform of the Green's function.
We have
$$g(w,y,y') = \int_{-\infty}^{\infty} G(x,y,0,y')\exp(-iwx)\,dx$$
$$\qquad = \cosh\gamma y\,\frac{[\gamma\cosh\gamma(a - y') - \beta\sinh\gamma(a - y')]}{\gamma[\gamma\sinh\gamma a - \beta\cosh\gamma a]}, \qquad y < y',$$
$$\qquad = \cosh\gamma y'\,\frac{[\gamma\cosh\gamma(a - y) - \beta\sinh\gamma(a - y)]}{\gamma[\gamma\sinh\gamma a - \beta\cosh\gamma a]}, \qquad y > y',$$
where $\gamma = (k^2 + w^2)^{1/2}$ and $g(w,y,y')$ is regular in the strip
$-(p_1^2 + a^2k^2)^{1/2}/a < \operatorname{Im} w < 0$. We have already described these calculations
and their justification in Part I. We now turn to the study of the transforms
of $I(x)$ and $\psi(x)$. The Fourier transform of $I(x)$ is
$$J(w) = \int_0^{\infty}\exp(-iwx)\,I(x)\,dx.$$
Subject to the verification of the integrability of $I(x)$ in $0 < x < L$ ($L > 0$),
we know that $J(w)$ is regular in the lower half plane $\operatorname{Im} w < 0$, since
$I(x) = O[\exp(\pm i\kappa' x)]$. On the other hand, the transform of $\psi(x)$ is
$$\Psi(w) = \int_{-\infty}^{0}\psi(x)\exp(-iwx)\,dx,$$
and this is regular in the upper half plane $\operatorname{Im} w > -(p_1^2 + a^2k^2)^{1/2}/a$, where
once again we assume the integrability of $\psi(x)$ in $-L_1 < x < 0$ ($L_1 > 0$)
subject to verification. Finally, the transform of $\phi_0(x)$ is
$$\Phi_0(w) = \int_0^{\infty}\exp(-iwx)\,\phi_0(x)\,dx,$$
and this is regular in the lower half plane $\operatorname{Im} w < 0$. There is, then, a common
strip of analyticity between the transforms $J(w)$, $\Psi(w)$, $\Phi_0(w)$ and
$g_{yy'}(w,b,b)$, namely $-(p_1^2 + a^2k^2)^{1/2}/a < \operatorname{Im} w < 0$, and we are thus permitted
to take the Fourier transform of equation (2.1) to get
$$-\frac{J(w)\,\gamma\sinh\gamma b\,[\gamma\sinh\gamma c - \beta\cosh\gamma c]}{[\gamma\sinh\gamma a - \beta\cosh\gamma a]} + \Phi_0(w) = \Psi(w). \tag{2.2}$$
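
The closed form quoted for $g(w,y,y')$ can be checked numerically at a point of the strip of regularity. The sketch below is illustrative only: the function names, the step size and the sample values of $w$, $a$, $\beta$, $k$ and $y'$ are assumptions. Using one-sided difference quotients, it verifies the unit jump of $\partial g/\partial y$ across $y = y'$ and the two boundary conditions $g_y = 0$ at $y = 0$ and $g_y = \beta g$ at $y = a$.

```python
import cmath


def g(w, y, y_src, a, beta, k):
    """Transform of the Green's function, written symmetrically in y, y_src."""
    gamma = cmath.sqrt(k * k + w * w)
    denom = gamma * (gamma * cmath.sinh(gamma * a) - beta * cmath.cosh(gamma * a))
    lo, hi = min(y, y_src), max(y, y_src)
    upper = (gamma * cmath.cosh(gamma * (a - hi))
             - beta * cmath.sinh(gamma * (a - hi)))
    return cmath.cosh(gamma * lo) * upper / denom


def derivative_jump(w, y_src, a, beta, k, h=1e-5):
    """Difference of the one-sided y-derivatives of g at y = y_src."""
    centre = g(w, y_src, y_src, a, beta, k)
    above = (g(w, y_src + h, y_src, a, beta, k) - centre) / h
    below = (centre - g(w, y_src - h, y_src, a, beta, k)) / h
    return above - below


def floor_residual(w, y_src, a, beta, k, h=1e-5):
    """Approximation to g_y at y = 0, which should vanish."""
    return (g(w, h, y_src, a, beta, k) - g(w, 0.0, y_src, a, beta, k)) / h


def surface_residual(w, y_src, a, beta, k, h=1e-5):
    """Approximation to g_y - beta*g at y = a, which should vanish."""
    top = g(w, a, y_src, a, beta, k)
    slope = (top - g(w, a - h, y_src, a, beta, k)) / h
    return slope - beta * top


# Hypothetical sample point with Im w slightly negative.
w, a, beta, k, y_src = 0.7 - 0.2j, 1.0, 1.0, 0.5, 0.4
print(derivative_jump(w, y_src, a, beta, k))
print(abs(floor_residual(w, y_src, a, beta, k)))
print(abs(surface_residual(w, y_src, a, beta, k)))
```

The first printed value should be close to $-1$, in agreement with the normalization of the Green's function, and the two residuals should be of the order of the step size.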

The next task which confronts us is the factoring problem. That is, we ask 
if we can arrange equation (2.2) so that the left side is regular in the lower 
half plane of regularity while the right side is regular in the appropriate upper 
half plane of regularity. To accomplish this, we write 


$$-\frac{\gamma\sinh\gamma b\,[\gamma\sinh\gamma c - \beta\cosh\gamma c]}{[\gamma\sinh\gamma a - \beta\cosh\gamma a]} = \frac{L_-(w)}{L_+(w)} = L(w),$$


where $L_-(w)$ is the factor of $L(w)$ which is regular in the lower half plane
$\operatorname{Im} w < 0$, while $L_+(w)$ is that factor of $L(w)$ which is regular in the upper half
plane $\operatorname{Im} w > -\theta$, where $\theta$ is the smallest of the three constants $\kappa/b$,
$(p_1^2 + a^2k^2)^{1/2}/a$, $(p_1'^2 + c^2k^2)^{1/2}/c$. Let us suppose that we have carried out this
factoring explicitly. Then equation (2.2) becomes





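$$J(w)L_-(w) + [p_0/a][\sinh p_0 b/a]\left[\frac{\alpha_1 L_+(\kappa)}{i(w - \kappa)} + \frac{\beta_1 L_+(-\kappa)}{i(w + \kappa)}\right] = L_+(w)\Psi(w)$$
$$\qquad + [p_0/a][\sinh p_0 b/a]\left[\frac{\alpha_1\{L_+(\kappa) - L_+(w)\}}{i(w - \kappa)} + \frac{\beta_1\{L_+(-\kappa) - L_+(w)\}}{i(w + \kappa)}\right]. \tag{2.3}$$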

The left side of equation (2.3) is now regular in the lower half plane $\operatorname{Im} w < 0$,
while the right side is regular in the upper half plane $\operatorname{Im} w > -\theta$ ($\theta > 0$),
and both sides are regular in a common strip $-\theta < \operatorname{Im} w < 0$. Hence the
left side of equation (2.3) is the analytical continuation of the right side and
both sides are regular everywhere. That is





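$$J(w)L_-(w) + [p_0/a]\sinh p_0 b/a\left[\frac{\alpha_1 L_+(\kappa)}{i(w - \kappa)} + \frac{\beta_1 L_+(-\kappa)}{i(w + \kappa)}\right] = E(w), \tag{2.4}$$
and
$$L_+(w)\Psi(w) + [p_0/a]\sinh p_0 b/a\left[\frac{\alpha_1\{L_+(\kappa) - L_+(w)\}}{i(w - \kappa)} + \frac{\beta_1\{L_+(-\kappa) - L_+(w)\}}{i(w + \kappa)}\right] = E(w), \tag{2.5}$$
where $E(w)$ is an entire function which is yet to be determined.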

The determination of $E(w)$ depends in part on the explicit factoring of $L(w)$.
In order to do this, we write $L(w)$ in a product representation which exhibits
its poles and zeros explicitly. Consider first
$$\gamma\sinh\gamma b = b(k^2 + w^2)\prod_{n=1}^{\infty}[1 + \gamma^2b^2/n^2\pi^2].$$
For the sake of convenience we assume $k > 0$. Then
$$\gamma\sinh\gamma b = b(w + ik)(w - ik)\prod_{n=1}^{\infty}[1 + k^2b^2/n^2\pi^2 + w^2b^2/n^2\pi^2]$$
$$\qquad = b(w + ik)(w - ik)\prod_{n=1}^{\infty}[\{1 + k^2b^2/n^2\pi^2\}^{1/2} + iwb/n\pi]\exp(-iwb/n\pi)$$
$$\qquad\quad \times \prod_{n=1}^{\infty}[\{1 + k^2b^2/n^2\pi^2\}^{1/2} - iwb/n\pi]\exp(iwb/n\pi),$$
where the exponential factors have been inserted to insure the absolute
convergence of the infinite products. The term
$$P_1(b,w) = (w - ik)\prod_{n=1}^{\infty}[\{1 + k^2b^2/n^2\pi^2\}^{1/2} + iwb/n\pi]\exp(-iwb/n\pi)$$













is free of zeros in the lower half plane $\operatorname{Im} w < +k$, while the remaining factor
$Q_1(b,w)$ is free of zeros in the upper half plane $\operatorname{Im} w > -k$ and
$P_1(b,w)\,Q_1(b,w) = \gamma\sinh\gamma b$.
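
The product representation of $\gamma\sinh\gamma b$ and its splitting into two groups of factors can be illustrated numerically. The sketch below is not part of the paper: it truncates the infinite product at a finite number of terms (an approximation), and the function names and sample values are hypothetical. It also confirms, for a single index $n$, that the two split factors multiply back to $1 + k^2b^2/n^2\pi^2 + w^2b^2/n^2\pi^2$, the convergence-producing exponentials cancelling in pairs.

```python
import cmath
import math


def gamma_sinh_exact(w, b, k):
    """gamma*sinh(gamma*b) with gamma = (k**2 + w**2)**0.5."""
    gamma = cmath.sqrt(k * k + w * w)
    return gamma * cmath.sinh(gamma * b)


def gamma_sinh_product(w, b, k, terms=200000):
    """Truncated product b*(k^2 + w^2)*prod(1 + gamma^2 b^2/(n pi)^2)."""
    gamma_sq = k * k + w * w
    product = complex(b) * gamma_sq
    for n in range(1, terms + 1):
        product *= 1.0 + gamma_sq * b * b / (n * math.pi) ** 2
    return product


def split_factors(w, b, k, n):
    """The n-th factors of the two half-plane products, exponentials included."""
    root = math.sqrt(1.0 + (k * b / (n * math.pi)) ** 2)
    shift = 1j * w * b / (n * math.pi)
    plus = (root + shift) * cmath.exp(-shift)
    minus = (root - shift) * cmath.exp(shift)
    return plus, minus


# Hypothetical sample point in the strip of regularity.
w, b, k = 0.7 - 0.2j, 0.4, 0.5
exact = gamma_sinh_exact(w, b, k)
approx = gamma_sinh_product(w, b, k)
plus, minus = split_factors(w, b, k, 3)
print(exact, approx)
```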

We now examine the expression $\gamma\sinh\gamma a - \beta\cosh\gamma a$. It has two real
zeros $\gamma a = \pm p_0$ and a sequence of imaginary zeros $\gamma a = \pm ip_n$, $n = 1, 2, \ldots$.
Furthermore for $n$ sufficiently large and positive, $p_n = n\pi + O(\beta a/n\pi)$.
We find here that
$$\gamma\sinh\gamma a - \beta\cosh\gamma a = -\beta(1 - \gamma^2a^2/p_0^2)\prod_{n=1}^{\infty}[1 + \gamma^2a^2/p_n^2].$$
Upon factoring this as we did $\gamma\sinh\gamma b$, we get
$$\gamma\sinh\gamma a - \beta\cosh\gamma a = P_2(a,\beta,w)\,Q_2(a,\beta,w),$$
where
$$P_2(a,\beta,w) = -\frac{a^2\beta(\kappa^2 - w^2)}{p_0^2}\prod_{n=1}^{\infty}[\{1 + a^2k^2/p_n^2\}^{1/2} + iaw/p_n]\exp(-iaw/n\pi)$$
is free of zeros in the lower half plane $\operatorname{Im} w < 0$, while $Q_2(a,\beta,w)$ is free of
zeros in the upper half plane $\operatorname{Im} w > -(p_1/a)\{1 + a^2k^2/p_1^2\}^{1/2}$. Finally, upon
replacing $a$ by $c$ and therefore $p_n$ by $p'_n$, we find the product decomposition
for $\gamma\sinh\gamma c - \beta\cosh\gamma c$.

These individual factors enable us to write L_(w) and L,(w) explicitly. 


We have 
L_(w) = exp [x(w)|P1(}, w) P2(c,8, w)/P2(a,8,w), 
and 


Ls(w) = exp [x(w)] Q:(6, w) Qe(c, 8, w)/b Q2(a, B, w). 


The introduction of the exponential factor $\exp[\chi(w)]$ requires some comment.
We shall presently examine $L_-(w)$ for $|w| \to \infty$, $\operatorname{Im} w < 0$, and $L_+(w)$ for
$|w| \to \infty$, $\operatorname{Im} w > -\theta$. It will thus be found that $L_-(w)$ and $L_+(w)$ are of
exponential order. The factor $\chi(w)$ will be chosen in such a fashion as to make
them both of algebraic growth in their respective half planes of regularity as
$|w| \to \infty$.

In order to find the asymptotic form of $L_-(w)$ for $|w| \to \infty$, $\operatorname{Im} w < 0$, we
recall first that
$$p_n = n\pi - \beta a/n\pi \quad\text{as } n \to \infty,$$
and
$$p'_n = n\pi - \beta c/n\pi \quad\text{as } n \to \infty.$$
Furthermore, we may neglect the terms $(ka/p_n)^2$ relative to unity for $|w|$
sufficiently large. Hence $L_-(w)$ is of the order
$$w\exp[\chi(w)]\;\frac{\prod_{n=1}^{\infty}[1 + iwb/n\pi]\exp(-iwb/n\pi)\;\prod_{n=1}^{\infty}[1 + iwc/n\pi]\exp(-iwc/n\pi)}{\prod_{n=1}^{\infty}[1 + iaw/n\pi]\exp(-iaw/n\pi)}.$$
But
$$1/\Gamma(\vartheta) = \vartheta\,e^{C\vartheta}\prod_{n=1}^{\infty}[1 + \vartheta/n]\exp(-\vartheta/n),$$
where $C$ is Euler's constant. Hence $L_-(w)$ is of the order
$$\exp[\chi(w)]\,\Gamma(iaw/\pi)/\Gamma(ibw/\pi)\,\Gamma(icw/\pi).$$
Upon applying the Stirling expansion theorem for the gamma function we
find that $L_-(w)$ is of the order
$$w^{1/2}\exp[\chi(w) + (iw/\pi)(a\log a - b\log b - c\log c)]$$
for $|w| \to \infty$, $\operatorname{Im} w < 0$. We then choose
$$\chi(w) = -(iw/\pi)(a\log a - b\log b - c\log c)$$
so that
$$L_-(w) = O(w^{1/2})$$
for $|w| \to \infty$, $\operatorname{Im} w < 0$. A similar calculation gives us that $L_+(w) = O(w^{-1/2})$
for $|w| \to \infty$, $\operatorname{Im} w > -\theta$, with the same choice for $\chi(w)$. We note, as a
check, that for $|w| \to \infty$,
$$L_-(w)/L_+(w) = O(w)$$
in the strip of regularity, as it should be.
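
The infinite product for the reciprocal of the gamma function, on which the estimate above rests, can be tested for real positive arguments with the standard library alone. This is a hedged sketch: the product is truncated at a finite number of terms, and the names `EULER_C`, `reciprocal_gamma_product` and `chi_coefficient` are hypothetical. The second function evaluates $a\log a - b\log b - c\log c$ with $c = a - b$, the real coefficient that multiplies $-iw/\pi$ in $\chi(w)$.

```python
import math

EULER_C = 0.5772156649015329  # Euler's constant


def reciprocal_gamma_product(z, terms=200000):
    """Truncated product z*exp(C z)*prod((1 + z/n)*exp(-z/n)) for real z > 0."""
    log_product = math.log(z) + EULER_C * z
    for n in range(1, terms + 1):
        log_product += math.log1p(z / n) - z / n
    return math.exp(log_product)


def chi_coefficient(a, b):
    """a*log(a) - b*log(b) - c*log(c) with c = a - b and 0 < b < a."""
    c = a - b
    return a * math.log(a) - b * math.log(b) - c * math.log(c)


for z in (0.5, 1.5, 3.0):
    # The product times Gamma(z) should be close to 1.
    print(z, reciprocal_gamma_product(z) * math.gamma(z))
print(chi_coefficient(1.0, 0.4))
```

Since $a\log a - b\log b - c\log c = b\log(a/b) + c\log(a/c)$, the coefficient is positive whenever $0 < b < a$.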

In order to determine $E(w)$, the entire function of separation, we are required
to examine the asymptotic forms of equations (2.4) and (2.5). Because we
anticipated our calculations and inserted $\chi(w)$ into the factoring of $L(w)$, we
see that $E(w)$ is at best of algebraic order for $|w| \to \infty$, that is, a polynomial.
We can say more than this about $E(w)$. For example $\Psi(w)$ approaches zero
for $|w| \to \infty$, $\operatorname{Im} w > -\theta$, as a consequence of the Riemann-Lebesgue lemma.
Since $L_+(w) = O(w^{-1/2})$ in this half plane and the remaining terms in equation
(2.5) are $O(w^{-1})$, it follows that $E(w) = o(w^{-1/2})$. But since $E(w)$ is an entire
function, it follows that $E(w)$ is zero in the upper half plane $\operatorname{Im} w > -\theta$,
$|w| \to \infty$. We now examine equation (2.4) and find that $E(w) = o(w^{1/2})$, that
is, $E(w)$ is constant in the lower half plane $\operatorname{Im} w < 0$, $|w| \to \infty$. From this
we conclude immediately that $E(w)$ is zero. We have finally
$$J(w) = -\frac{p_0\sinh p_0 b/a}{a\,i\,L_-(w)}\left[\frac{\alpha_1 L_+(\kappa)}{w - \kappa} + \frac{\beta_1 L_+(-\kappa)}{w + \kappa}\right],$$
which tells us that $J(w) = O(w^{-3/2})$ for $|w| \to \infty$, $\operatorname{Im} w < 0$, and hence that
$\phi_+(x) - \phi_-(x) = O(x^{1/2})$, $x \to 0^+$, that is, $\phi_+(x) - \phi_-(x)$ is integrable in the
neighbourhood of the origin. Similarly we find that $\Psi(w) = O(w^{-1/2})$, for
$|w| \to \infty$, $\operatorname{Im} w > -\theta$, so that $\psi(x) = O(x^{-1/2})$, $x \to 0^-$.


3. The determination of ¢(x, y). In order to determine ¢(x, y), we write 
equation (1.4) in Fourier integral representation. Upon doing this, we get 













$$\phi(x,y) = \frac{1}{2\pi}\int_{\Gamma}\exp(iwx)\,J(w)\,g_{y'}(w,y,b)\,dw + [\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)]\cosh p_0 y/a, \tag{3.1}$$
where $\Gamma$ is a contour drawn in the strip of regularity $-\theta < \operatorname{Im} w < 0$. We
have two representations for $g(w,y,b)$ depending on whether $y > b$ or $y < b$.
The path $\Gamma$ is closed, for $x > 0$, by a semi-circle passing between the poles on
the positive imaginary axis and the radius of the circle is then allowed to
become infinite. Because of the growth of $g_{y'}(w,y,b)$ and $J(w)$ in the upper
half plane this is a legitimate closing of the contour $\Gamma$. These details are
described adequately by many authors and we shall not pursue the matter
further. For $x < 0$, a similar closing is performed in the lower half plane.

We have then three representations for $\phi(x,y)$ depending on whether $x \le 0$,
$0 \le y \le a$; $x \ge 0$, $0 \le y \le b$; or $x \ge 0$, $b \le y \le a$. We determine these
from equation (3.1) by a direct evaluation of the residues and the appropriate
form of the Green's function. For $x \le 0$, $0 \le y \le a$ we have


(3.2) (x,y) = [a1 exp (sx) + 6; exp (—ixx)] cosh poy/a 


— po sinh pob/a + = _ E at 4+ peat | 





yc {sin pab/a) (cos pny/2) pn (pn' +a") | 
wa*(p,* + 6*a* — a*6*) 





The summation with respect to w is over the sequence 
w= — i(p,? + a*k’)*/a, (s = 1,2,...). 
For x 2 0,0 < y < b we have 





3) ofs,y) — — PeSimbed/e [eels 4 Abe(—2)] 





w-—K w+« 
[= n wy/b) exp (el) 
wh( —)"L(w) 





__ Po sinh pob/a Eee + file(s) leap (—2kx) : 
aL (tk) ik — « tk +x 2tkb 


In equation (3.3) the summation with respect to w is the sequence 
= i(k + n*x*/b*)*, (s = 1,2,...). 


w 
Finally for x 2 0,b < y < a, we have 


' 
















(3.4) (x,y) ~ Po sinh pob/a {{ e+ + ate(—0)| exp (ix’x) 
e—e ete J Lye) 


x Eee _ Bibs =] exp isan (6'? — 6°) cosh p’o(b—y)/c 











+k | ed Li(—«’) «'c(Bc — Bc + p’*) 
_ po sinh pob/a <. exp (wx) {nie 4 ib+( = 
a w Ls(w) w-K w-K 


- {cos p’n(b — y)/c}{ Bc? + p’n?} 
cw(p’,,? — Bc? + Bc) 


where now the summation with respect to w is on the sequence 





w = i{k’+p',?/c}* (mn = 1,2,...). 


Let us now examine the convergence of the infinite series in equations (3.2),
(3.3) and (3.4). In equation (3.2), the general term of the infinite series is of
the order
$$(\exp n\pi x/a)(\sin n\pi b/a)(\cos n\pi y/a)/n^{3/2}, \qquad n \gg 1,$$
so that the series in (3.2) converges absolutely for $x \le 0$, $0 \le y \le a$. For
equation (3.3) we have for the order of the general term
$$(\cos n\pi y/b)\exp(-n\pi x/b)(-1)^n/n^{3/2}, \qquad n \gg 1,$$
while for equation (3.4) we have
$$\exp(-n\pi x/c)\cos[n\pi(y - b)/c]/n^{3/2}, \qquad n \gg 1,$$
so that the infinite series converge absolutely for $x \ge 0$ and $0 \le y \le b$ or
$b \le y \le a$ as the case may be. From the order of the general term we may
deduce the behaviour of $\phi(x,y)$ in the neighbourhood of $x = 0$, $y = b$. Let us
take equation (3.2) first. We write for abbreviation $r^2 = x^2 + (y - b)^2$. Then
$$\sum_{n=1}^{\infty}\exp(n\pi x/a)(\sin n\pi b/a)(\cos n\pi y/a)/n^{3/2} = O(1 + r^{1/2})$$
for $x \to 0^-$, $y \to b$. That is, $\phi(x,y)$ is bounded and its derivative with respect
to $r$ becomes infinite in such a fashion that $r\phi_r$ is finite, indeed zero, for $r \to 0$.
A similar remark may be applied to the cases $x \to 0^+$, $y \to b^-$ and $x \to 0^+$,
$y \to b^+$. Hence this application of Green's theorem in the neighbourhood of
the point $x = 0$, $y = b$ is justified. Since the series expansions in equations
(3.2), (3.3), and (3.4) converge both uniformly and absolutely in the regions
given above, it is a simple matter to find their asymptotic forms. For example,
for $x \to -\infty$, $0 \le y \le a$, $\phi(x,y)$ is asymptotic to
$$[\cosh p_0 y/a][\alpha_1\exp(i\kappa x) + \beta_1\exp(-i\kappa x)].$$
The next term in the expansion is $O[\exp\{p_1^2 + a^2k^2\}^{1/2}x/a]$. For $x \to \infty$,
$0 \le y < b$, $\phi(x,y) = O[\exp(-kx)]$, while for $x \to \infty$, $b \le y \le a$,
$$\phi(x,y) = O[\alpha_2\exp(i\kappa' x) + \beta_2\exp(-i\kappa' x)].$$
It is clear, then, that one can take the unilateral Fourier transforms of $\phi(x,y)$
for positive $x$, as long as $\operatorname{Im} w < 0$. Finally, we can show that since the
integral in equation (3.1) is uniformly convergent with respect to both $x$
and $y$ for $0 \le y \le a$, $-\infty < x < \infty$, it defines a continuous function of $x$
and $y$ in this region. Hence the three representations of $\phi(x,y)$ which we
have are continuous across the line $x = 0$.


4. Reflection and transmission properties of the barrier. If we examine 
equation (3.4) for x — ©, we find that the only bounded term is 


Po coh wilt {[ +0 4 Al +( = exp (ix’x) 
a a w—kK «+k L(x’) 
+ [see _ Ais) lop (“#9 {cosh p’o(b—y)/c} |p’? — Bc} 
«+k c—« Li(—x«’') cx’ (p'? — Bc? + Bc) 


We have here the required linear relations between a;, a2, 8: and B:. For 
example 


— po(sinh peb/a)(p' o? — 6*c*) kes £ Ate) l ; 
ack’ (p' @ — B*c? + Bc) x —K +k L(x’) 

















and 


Bs 











__ Po(sinh pob/a)(p'? — 6*c*) [see 4 BL +( =] 1 
ack’ (p'? — Bc? + Bc) « +« xn —k Li(—x’) 


Now $\alpha_1$ is the amplitude of the wave incident upon the barrier, so that $\beta_1$ is
the amplitude of the wave reflected from the barrier for $x < 0$. Hence, if we
require, for example, that no wave be incident from the right, that is $\beta_2 = 0$,
then $\alpha_2$ is the amplitude of the wave transmitted to the right. In this case
the reflection coefficient on the left is $r_1 = \beta_1/\alpha_1$, while the transmission
coefficient on the right is $t_1 = \alpha_2/\alpha_1$. On the other hand, if there is no wave
incident from the left, then $\alpha_1 = 0$, and the reflection coefficient on the right is
$r_2 = \alpha_2/\beta_2$, while the transmission coefficient on the left is $t_2 = \beta_1/\beta_2$.

It is not difficult to give $r_1$, $r_2$, $t_1$ and $t_2$ in terms of $k$, $\beta$ and the ratio $c/a$. In
the first place, since $L_+(\kappa)$ is the conjugate of $L_+(-\kappa)$, we have








(x’ — x) 2 
n= cp (2% 
1 ( +) exp (2i01) 
where o; = arg L(x). Similarly 
nw —« ? 
= — 2 
To 5 a? ( 102) 


where o2 = arg L(x’). Furthermore 


— (po sinh pob/a) (p's? — Bc?) L(x) 4uac , 
(p'® — Bc + Bc) Ls (n’)(x’ + x) (apc? — Cpe’) 





ee 










while 


_ ae! — aden! (p's? — Bet + Bc) Ly — «’) 
pe(sinh pob/a)(o's? — BA) L( — «) 





We are then left with the task of providing the magnitude and phase of
$L_+(\pm\kappa)$ and $L_+(\pm\kappa')$. In the first place





























' IL4(+«)? = (Ba + po — Sa*)(a*p’? — parc) 
2ap' (po? — *a*) sinh *pob/a 
while 
fs -™ 2 
[Ls(a «'))? = ee | 
p'o(p'e?a® — po’c*) (Bc + pe? — Bc) 
Hence 

: Ly(+«) P _ Ge + o's? — 6*)(Ba + pot — Ba*)(a*p's? — pate’)? 
Li(+«’) 4c*a(p7 — 6'a*)(p'? — Bc*) sinh* pob/a 
From this we see that 

mo ef eb fee — Beh" fot + be — Pe exp sler—on), 
} '  2Xe po — Ba?) lp’? + Be — Bre’) k+« 
' and 





sins me {eV fod -- oy {ee + Bc — = i(o1 — 02) 
po lad We—Be) lot + ba — Ba e+e 


The phase angles $\sigma_1$ and $\sigma_2$ are given by the following infinite series


@ 
o1= > {arc sin xa/{p,? — pot}* — xa/nr} 
a=1 


| — Z {arc sin xc/{p’.2 — po} — xc/nn} 

a=1 
} © 

— £ {arc sin xb/{n*x® — po?}* — xb/nx} 

n=1 
} — arc sin ka/po — — {a log a — b log b — c log c}, 
T 
o:= > {arcsin«‘a/{p,? — p'e}* — «‘a/nz} 

n=l 


@o 
— £ {arc sin «’c/{p’,2 — p'e}* — «’'c/nx} 
n=1 
» Pa 
— > fare sin «'b/\ n?x?— pe} — «'b/nx} — arc sin ke/p'o 
n=1 


— — {a log a — b log b — c log c} . 
















It is clear that the $n$th term in any of the above infinite series is $O(1/n^2)$, so
that the series all converge. It is not difficult to calculate these series as
functions of $k$ and $\beta$.


5. A reciprocity theorem. We have shown how $\alpha_1$, $\alpha_2$, $\beta_1$ and $\beta_2$ can be
used to define the various reflection and transmission coefficients. Let us
observe that $\phi(x,y)$ is complex and in order to obtain a real solution we merely
have to take either the real or imaginary parts of
$$\exp(ikz + ift)\,\phi(x,y),$$
where $\beta g = f^2$ and $g$ is the acceleration of gravity in appropriate units. Now,
in complex form we may show that a relation exists between the magnitudes
of $\alpha_1$, $\alpha_2$, $\beta_1$, and $\beta_2$. We start with Green's theorem. If $\phi^*(x,y)$ denotes the
conjugate of $\phi(x,y)$, then
$$\iint[\phi\nabla^2\phi^* - \phi^*\nabla^2\phi]\,dA = \oint[\phi\phi^*_n - \phi^*\phi_n]\,ds', \tag{5.1}$$
where the area is the region we have described in sec. 1. That is, it is a
rectangle of length $L + L_1$ and width $a$ with a horizontal cut parallel to the $x$
axis as we have described. The left side of equation (5.1) vanishes since
$\nabla^2\phi = k^2\phi$ and $\nabla^2\phi^* = k^2\phi^*$. Further, because of the boundary conditions on $\phi$
and $\phi^*$, there are no contributions to the line integral along the rigid barriers
or the free surface. Finally since there are no sources, we have
$$-\int_0^a[\phi\phi^*_x - \phi^*\phi_x]_{x=-L_1}\,dy + \int_0^a[\phi\phi^*_x - \phi^*\phi_x]_{x=L}\,dy = 0.$$
But, if we choose $L$ and $L_1$ sufficiently large and positive, we have that
$$2i\kappa[|\alpha_1|^2 - |\beta_1|^2]\int_0^a\cosh^2(p_0 y/a)\,dy - 2i\kappa'[|\alpha_2|^2 - |\beta_2|^2]\int_b^a\cosh^2\{p'_0(b - y)/c\}\,dy$$
$$\qquad + O[\exp(-\theta L)] + O[\exp(-\theta L_1)] = 0.$$
Hence when $L$ and $L_1 \to \infty$, we have simply that
$$\frac{\kappa a[|\alpha_1|^2 - |\beta_1|^2][p_0^2 + \beta a - \beta^2a^2]}{p_0^2 - \beta^2a^2} = \frac{\kappa' c[|\alpha_2|^2 - |\beta_2|^2][p_0'^2 + \beta c - \beta^2c^2]}{p_0'^2 - \beta^2c^2}.$$
Upon substituting in the expressions we found for $\alpha_2$ and $\beta_2$ in terms of $\alpha_1$ and
$\beta_1$ we find that the above relation is identically satisfied.
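
The passage from the integrals of $\cosh^2$ to the final relation uses the closed form $\int_0^a\cosh^2(p_0y/a)\,dy = a(p_0^2 + \beta a - \beta^2a^2)/2(p_0^2 - \beta^2a^2)$, which follows from $p_0\tanh p_0 = \beta a$. The following sketch checks this identity by Simpson's rule. It is an illustration under stated assumptions: the function names and the sample depth and β are hypothetical, and the same check applies to the region over the barrier with $a$ replaced by $c$.

```python
import math


def positive_root(depth_times_beta, tol=1e-13):
    """Positive root of p*tanh(p) = depth*beta, found by bisection."""
    lo, hi = 0.0, depth_times_beta + 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mid * math.tanh(mid) < depth_times_beta:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cosh_sq_integral(depth, p, steps=2000):
    """Simpson's rule for the integral of cosh(p*y/depth)**2 over (0, depth)."""
    h = depth / steps
    total = math.cosh(0.0) ** 2 + math.cosh(p) ** 2
    for i in range(1, steps):
        weight = 4.0 if i % 2 else 2.0
        total += weight * math.cosh(p * i * h / depth) ** 2
    return total * h / 3.0


def closed_form(depth, beta, p):
    """depth*(p^2 + beta*depth - beta^2 depth^2) / (2*(p^2 - beta^2 depth^2))."""
    bd = beta * depth
    return depth * (p * p + bd - bd * bd) / (2.0 * (p * p - bd * bd))


depth, beta = 1.0, 1.0                 # hypothetical sample values
p0 = positive_root(depth * beta)
numeric = cosh_sq_integral(depth, p0)
exact = closed_form(depth, beta, p0)
print(numeric, exact)
```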





The Carnegie Institute of Technology 



















THE RELATION BETWEEN FUNCTIONS SATISFYING 
A CERTAIN INTEGRAL EQUATION AND 
GENERAL WATSON TRANSFORMS 


F. M. GOODSPEED 


1. In some work of Ramanujan¹ certain results are given which are equivalent
to the following.

If
$$\int_0^{\infty}F(x)\,F(ux)\,dx = \frac{1}{1 + u} \tag{1}$$
and
$$f(s) = \int_0^{\infty}F(x)\,x^{s-1}\,dx,$$
then
$$f(s)\,f(1 - s) = \frac{\pi}{\sin\pi s}. \tag{a}$$
Also, if
$$G(x) = \sqrt{\frac{2}{\pi}}\;\frac{F(ix) + F(-ix)}{2},$$
then the relations
$$\int_0^{\infty}A(x)\,G(xy)\,dx = B(y), \qquad \int_0^{\infty}B(y)\,G(xy)\,dy = A(x) \tag{b}$$
are consequences of one another for arbitrary functions $A(x)$.

Examples can be given of both the truth and falsity of these results.

The function $F(x) = e^{-x}$ satisfies (1) and gives $f(s) = \Gamma(s)$ and
$G(x) = \sqrt{2/\pi}\,\cos x$. Therefore (a) is true, and by Fourier integral theory the
formulae (b) are consequences of one another for functions $A(x)$ with suitable
properties.

On the other hand (1) is satisfied by
$$F(x) = \sqrt{\frac{2}{\pi}}\;\frac{x}{1 + x^2},$$
which yields $f(s) = \sqrt{\pi/2}\,\sec\tfrac12\pi s$ and $G(x) = 0$ for all values of $x$ except
$\pm 1$. In this case (a) is true, but the formulae (b) are certainly not
consequences of one another under any circumstances.
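
Both examples can be checked numerically against (1), and the first against (a). The sketch below is only an illustration: the quadrature (a midpoint rule after the substitution $x = t/(1 - t)$), the function names and the sample values of $u$ and $s$ are all assumptions made here and do not come from the paper.

```python
import math


def integrate_half_line(func, nodes=20000):
    """Midpoint rule on (0, 1) after substituting x = t/(1 - t)."""
    total = 0.0
    for i in range(nodes):
        t = (i + 0.5) / nodes
        x = t / (1.0 - t)
        total += func(x) / (1.0 - t) ** 2
    return total / nodes


def F_exp(x):
    """First example, F(x) = exp(-x)."""
    return math.exp(-x)


def F_rational(x):
    """Second example, F(x) = sqrt(2/pi) * x / (1 + x^2)."""
    return math.sqrt(2.0 / math.pi) * x / (1.0 + x * x)


def lhs_of_1(F, u):
    """Left side of (1): the integral of F(x)*F(u*x) over (0, infinity)."""
    return integrate_half_line(lambda x: F(x) * F(u * x))


def reflection_residual(s):
    """Gamma(s)*Gamma(1 - s) - pi/sin(pi*s) for real 0 < s < 1."""
    return math.gamma(s) * math.gamma(1.0 - s) - math.pi / math.sin(math.pi * s)


def secant_residual(s):
    """(pi/2)*sec(pi*s/2)*sec(pi*(1 - s)/2) - pi/sin(pi*s) for 0 < s < 1."""
    product = (math.pi / 2.0) / (math.cos(0.5 * math.pi * s)
                                 * math.cos(0.5 * math.pi * (1.0 - s)))
    return product - math.pi / math.sin(math.pi * s)


for u in (0.5, 2.0):
    print(u, lhs_of_1(F_exp, u), lhs_of_1(F_rational, u), 1.0 / (1.0 + u))
print(reflection_residual(0.3), secant_residual(0.3))
```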





Received June 14, 1949. 
¹See, for example, Ramanujan by G. H. Hardy, Chapter 11, Formula F.




Taking the results as they stand, if (1) need hold for real values of u only, 
then F(x) need be defined for real values of x only, and there is no obvious 
reason to suppose that F(ix) or G(x) can be defined in any sense. 


2. The results proven in this paper show that if $F(x) \in L^2(0,\infty)$ and satisfies (1) then (a) is true and F(x) is the value taken on the real axis by an analytic function F(z), regular for R(z) > 0. Also, the formula
\[ G_1(x) = \lim_{u \to 0} \sqrt{\frac{2}{\pi}} \int_0^x \frac{F(u+iv) + F(u-iv)}{2}\,dv \]
defines a function $G_1(x)$ for almost all positive x, and $G_1(x)/x$ is a Watson kernel, i.e. $G_1(x)/x \in L^2(0,\infty)$ and the relations
\[ \frac{d}{dy} \int_0^\infty A(x)\,\frac{G_1(xy)}{x}\,dx = B(y) \]
and
\[ \frac{d}{dx} \int_0^\infty B(y)\,\frac{G_1(xy)}{y}\,dy = A(x) \]
are consequences of one another for $A(x) \in L^2(0,\infty)$.


3. The tools most often used in the following work are $L^2$ Mellin transform theory and general Watson transform theory. The main results of the Mellin transform theory used are given² in T., Theorems 71 and 72 with $k = \tfrac12$, while the Watson (or general) transform theory is described in T., Chapter VIII.

THEOREM 1. Let F(x) belong to the class $L^2(0,\infty)$ and satisfy (1) for all u > 0. Let f(s) be the Mellin transform of F(x) defined in the mean square sense for $R(s) = \tfrac12$. Then
\[ f(s)\,f(1-s) = \Gamma(s)\,\Gamma(1-s) \tag{2} \]
for $R(s) = \tfrac12$.

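For, applying the Parseval formula for Mellin transforms (T., Theorem 72) to the left-hand side of (1), we have
\[ \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} f(s)\,f(1-s)\,u^{-s}\,ds = \frac{1}{1+u} \]
since $f(s)u^{-s}$ is the Mellin transform of F(ux). But
\[ \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \Gamma(s)\,\Gamma(1-s)\,u^{-s}\,ds = \frac{1}{1+u} \]
and so
\[ \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \{ f(s)f(1-s) - \Gamma(s)\Gamma(1-s) \}\,u^{-s}\,ds = 0. \]
Since $\Gamma(\tfrac12 + it)$ and $f(\tfrac12 + it)$ are $L^2(-\infty,\infty)$, $f(s)f(1-s) - \Gamma(s)\Gamma(1-s)$ is $L(\tfrac12 - i\infty, \tfrac12 + i\infty)$ and so the inverse form of T., Theorem 32, can be applied.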


²The abbreviation T. is used for E. C. Titchmarsh, Introduction to the Theory of Fourier Integrals, Oxford, 1937.




This yields the formula $f(s)f(1-s) - \Gamma(s)\Gamma(1-s) = 0$ and proves Theorem 1.

Theorem 1 shows that the assertion (a) holds rigorously for functions of $L^2(0,\infty)$.
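As an illustration of (2), the closed form $f(s) = \sqrt{\pi/2}\,\sec\tfrac12\pi s$ quoted in Section 1 can be compared with $\Gamma(s)\Gamma(1-s) = \pi/\sin\pi s$. The sketch below is only a spot check at real s in (0, 1), not on the full line $R(s) = \tfrac12$; the function names are hypothetical.

```python
import math

def f_rational(s):
    # Mellin transform of sqrt(2/pi) * x / (1 + x^2), as quoted in the text
    return math.sqrt(math.pi / 2.0) / math.cos(0.5 * math.pi * s)

def gamma_product(s):
    # Gamma(s) * Gamma(1 - s), the right-hand side of (2)
    return math.gamma(s) * math.gamma(1.0 - s)

def reflection_value(s):
    # pi / sin(pi s), the right-hand side of the formal result (a)
    return math.pi / math.sin(math.pi * s)
```

All three quantities `f_rational(s) * f_rational(1 - s)`, `gamma_product(s)` and `reflection_value(s)` should coincide for 0 < s < 1.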


THEOREM 2. Let F(x) be a real function defined in $(0,\infty)$ satisfying the conditions of Theorem 1 and let f(s) be defined as before. If
\[ k(s) = \frac{f(1-s)}{\Gamma(1-s)} \tag{3} \]
and
\[ \frac{K_1(x)}{x} = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(s)}{1-s}\,x^{-s}\,ds, \tag{4} \]
the integral being defined in the mean square sense, then $K_1(x)/x$ belongs to the class $L^2(0,\infty)$ and is the kernel of a Watson transform and
\[ F(x) = x \int_0^\infty K_1(t)\,e^{-xt}\,dt = -x\,\frac{d}{dx} \int_0^\infty \frac{K_1(t)}{t}\,e^{-xt}\,dt \tag{5} \]
for almost all values of x. The function F(x) is therefore almost everywhere the value taken on the real axis by a function F(z) = F(x + iy) regular in the right half-plane.
The statement that $K_1(x)/x$ is the kernel of a Watson transform means that it has the same properties as the function $k_1(x)/x$ of T., Theorem 129.
Using the results of Theorem 1 we have
\[ k(s)\,k(1-s) = \frac{f(1-s)\,f(s)}{\Gamma(s)\,\Gamma(1-s)} = 1. \tag{6} \]
Also, as F(x) is a real function, f(s) and therefore k(s) take conjugate values for conjugate values of s and so
\[ |k(\tfrac12 + it)| = |k(\tfrac12 + it)\,k(\tfrac12 - it)|^{1/2} = 1. \]
Therefore $K_1(x)/x$, as defined by (4), is a function of $L^2(0,\infty)$ and is a Watson kernel.
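We also have for almost all x
\[ F(x) = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} f(s)\,x^{-s}\,ds = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} k(1-s)\,\Gamma(s)\,x^{-s}\,ds \]
\[ = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,s\Gamma(s)\,x^{-s}\,ds = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\Gamma(s+1)\,x^{-s}\,ds \]
\[ = \int_0^\infty \frac{K_1(t)}{t}\,xt\,e^{-xt}\,dt = x \int_0^\infty K_1(t)\,e^{-xt}\,dt. \]
The first integral is the usual inverse Mellin transform, the second is obtained by using (3) and finally we apply the Parseval formula for Mellin transforms and use the fact that $k(s)/(1-s)$ is the transform of $K_1(t)/t$ and that $\Gamma(s+1)x^{-s}$ is the transform of $xt\,e^{-xt}$.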













The second formula for F(x) in (5) can be obtained from the first by an appeal to general theory of reversion of the order of integration and differentiation, or by differentiation with respect to x of the relation
\[ \int_x^\infty \frac{F(t)}{t}\,dt = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)\,\Gamma(s)}{s}\,x^{-s}\,ds = \int_0^\infty \frac{K_1(t)}{t}\,e^{-xt}\,dt. \]
This relation is obtained by using the Parseval formula for Mellin transforms and the information that the transform of F(t) is $k(1-s)\Gamma(s)$, the transform of the function equal to 1/t for $t \ge x$ and zero elsewhere is $-x^{s-1}/(s-1)$, and the transform of $e^{-xt}$ is $\Gamma(s)x^{-s}$.

The integrals over the range $(\tfrac12 - i\infty, \tfrac12 + i\infty)$ occurring in this transformation are mean square integrals in Mellin transform theory, but, as $k(1-s)$ is bounded and $\Gamma(\tfrac12 + it) = O(e^{-\frac12\pi|t|})$, they are absolutely convergent as well, and so may be taken in the ordinary sense.

COROLLARY 1. The condition that F(x) be real in Theorem 2 may be dispensed with if k(s) as defined by (3) is bounded on $R(s) = \tfrac12$.

For in the argument used in proving Theorem 2 the condition that F(x) be real is only used to prove the boundedness of k(s) on $R(s) = \tfrac12$.


COROLLARY 2. Any function F(x) defined by (5), where $K_1(t)/t$ is a Watson kernel, will satisfy (1).

If $K_1(t)/t$ is a Watson kernel, the final part of the argument may be reversed and we obtain
\[ F(x) = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} k(1-s)\,\Gamma(s)\,x^{-s}\,ds \]
where $k(s)k(1-s) = 1$. Thus F(x) has the Mellin transform $k(1-s)\Gamma(s)$, and applying the Parseval relation,
\[ \int_0^\infty F(x)\,F(ux)\,dx = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} k(s)\,\Gamma(1-s)\,k(1-s)\,\Gamma(s)\,u^{-s}\,ds \]
\[ = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \Gamma(s)\,\Gamma(1-s)\,u^{-s}\,ds = \frac{1}{1+u}. \]

THEOREM 3. Let F(x) satisfy the conditions of Theorem 2 and let F(z) be its continuation, regular for R(z) > 0. Then if
\[ G_1(u,x) = \sqrt{\frac{2}{\pi}} \int_0^x \frac{F(u+iv) + F(u-iv)}{2}\,dv, \]
the limit as $u \to +0$ of $G_1(u,x)/x$ exists for almost all x in $(0,\infty)$ and defines a function $G_1(x)/x$ which belongs to the class $L^2(0,\infty)$ and is the kernel of a Watson transform. The Mellin transform of $G_1(x)/x$ is $\sqrt{2/\pi}\,f(s)\cos\tfrac12\pi s/(1-s)$ where f(s) is the Mellin transform of F(x).














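We have
\[ G_1(u,x) = \sqrt{\frac{2}{\pi}} \int_0^x \frac{F(u+iv) + F(u-iv)}{2}\,dv = \frac{1}{i}\,\tfrac12\sqrt{\frac{2}{\pi}} \int_{u-ix}^{u+ix} F(w)\,dw \]
and using (5), with u > 0,
\[ iG_1(u,x) = \tfrac12\sqrt{\frac{2}{\pi}} \int_{u-ix}^{u+ix} w\,dw \int_0^\infty K_1(t)\,e^{-wt}\,dt = \tfrac12\sqrt{\frac{2}{\pi}} \int_{u-ix}^{u+ix} \Bigl( -w\,\frac{d}{dw} \int_0^\infty \frac{K_1(t)}{t}\,e^{-wt}\,dt \Bigr) dw. \tag{7} \]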


In order to calculate the limit of $G_1(u,x)$ as $u \to 0$ we refer to T., Theorems 94 and 95. After a change of variable amounting to the rotation of the complex plane through a right angle, these theorems show that if $\Phi(x)$ has a Fourier transform $\phi(x)$ which is null for x < 0, then
\[ \Phi(u + ix) = \frac{1}{\sqrt{2\pi}} \int_0^\infty \phi(t)\,e^{-(u+ix)t}\,dt \]
converges in mean square as well as almost everywhere to $\Phi(x)$ as $u \to 0$. Here $\Phi(x)$ is taken to be
\[ \int_0^\infty \frac{K_1(t)}{t}\,e^{-ixt}\,dt \]
where the integration is in the mean square sense, and its transform $\phi(x)$ is $\sqrt{2\pi}\,K_1(x)/x$ for $x \ge 0$ and zero for x < 0. Hence, as $u \to 0$,
\[ \int_0^\infty \frac{K_1(t)}{t}\,e^{-(u+ix)t}\,dt = \Phi(u + ix) \]
converges in the mean square sense and also in the ordinary sense almost everywhere to
\[ \int_0^\infty \frac{K_1(t)}{t}\,e^{-ixt}\,dt. \]


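But mean square convergence over the infinite range in x implies mean convergence with index 1 over any finite range, and so
\[ \lim_{u \to 0} \int_{-x}^{x} \Bigl| \int_0^\infty \frac{K_1(t)}{t}\,e^{-(u+iy)t}\,dt - \int_0^\infty \frac{K_1(t)}{t}\,e^{-iyt}\,dt \Bigr|\,dy = 0 \]
or
\[ \lim_{u \to 0} \int_{u-ix}^{u+ix} dw \int_0^\infty \frac{K_1(t)}{t}\,e^{-wt}\,dt = \int_{-ix}^{ix} dw \int_0^\infty \frac{K_1(t)}{t}\,e^{-wt}\,dt. \]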






















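Therefore the limit of $iG_1(u,x)$ as $u \to 0$ is
\[ \tfrac12\sqrt{\frac{2}{\pi}} \Bigl\{ -ix \int_0^\infty \frac{K_1(t)}{t}\,e^{-ixt}\,dt - ix \int_0^\infty \frac{K_1(t)}{t}\,e^{ixt}\,dt + \int_{-ix}^{ix} dw \int_0^\infty \frac{K_1(t)}{t}\,e^{-wt}\,dt \Bigr\} \]
\[ = -i\sqrt{\frac{2}{\pi}} \Bigl\{ x \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt - \int_0^x dy \int_0^\infty \frac{K_1(t)}{t}\cos yt\,dt \Bigr\}. \]
Hence
\[ \frac{G_1(x)}{x} = \lim_{u \to 0} \frac{G_1(u,x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{1}{x} \int_0^x dy \int_0^\infty \frac{K_1(t)}{t}\cos yt\,dt - \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt \tag{8} \]
or
\[ \frac{G_1(x)}{x} = \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\,\frac{\sin xt}{xt}\,dt - \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt. \tag{9} \]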

Another and more direct method of deriving (8) and (9) from (7) is as follows. In the first expression in (7) for $iG_1(u,x)$, if we reverse the order of integration (as is justified by uniform convergence), carry out the integration with respect to w and then rearrange we obtain
\[ G_1(u,x) = \sqrt{\frac{2}{\pi}} \Bigl\{ u \int_0^\infty \frac{K_1(t)}{t}\,e^{-ut}\sin xt\,dt - x \int_0^\infty \frac{K_1(t)}{t}\,e^{-ut}\cos xt\,dt + \int_0^\infty \frac{K_1(t)}{t}\,e^{-ut}\,\frac{\sin xt}{t}\,dt \Bigr\}. \]
Now applying the Parseval relation for cosine transforms to the integral in the second term of this expression, we have
\[ \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\,e^{-ut}\cos xt\,dt = \frac{1}{\pi} \int_0^\infty K_c(y) \Bigl\{ \frac{u}{u^2 + (x+y)^2} + \frac{u}{u^2 + (x-y)^2} \Bigr\}\,dy \]
where $K_c(y)$ is the cosine transform of $K_1(t)/t$ and
\[ \tfrac12\sqrt{\frac{2}{\pi}} \Bigl\{ \frac{u}{u^2 + (x+y)^2} + \frac{u}{u^2 + (x-y)^2} \Bigr\} \]
is the cosine transform of $e^{-ut}\cos xt$. But by the theory of the Cauchy singular integral (T., Theorem 13 and §1.17),
\[ \lim_{u \to 0} \frac{1}{\pi} \int_0^\infty K_c(y) \Bigl\{ \frac{u}{u^2 + (x+y)^2} + \frac{u}{u^2 + (x-y)^2} \Bigr\}\,dy = K_c(x) \]






for almost all positive x if $K_c(y)/(1 + y^2) \in L(0,\infty)$, which is true since $K_c(y) \in L^2(0,\infty)$. Therefore
\[ \lim_{u \to 0} \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\,e^{-ut}\cos xt\,dt = \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt \]
where the integral on the right-hand side is taken in the mean square sense.

If the cosine is replaced by a sine a similar result holds.

The above is just a proof that the integrals of the $L^2$ cosine and sine transforms are Abel summable to the same value as is obtained by mean square methods.

These results give the limits as $u \to 0$ of the first two terms in the expression for $G_1(u,x)$, while the limit of the third is just the Abel sum of a convergent integral, since $K_1(t)/t$ and $\sin xt/t$ are both $L^2(0,\infty)$. As the limit of the first term is zero our final result is (9) as before.





Now the first term in (8) is of the form
\[ \beta(x) = \frac{1}{x} \int_0^x \alpha(y)\,dy \]
where $\alpha(y)$, being the cosine transform of a function of $L^2(0,\infty)$, is also $L^2(0,\infty)$. Hence $\beta(x)$ is of the class $L^2(0,\infty)$ (T., p. 396). The second term of (8) is the cosine transform of $K_1(t)/t$ (a mean square integral) and is also $L^2(0,\infty)$ and so $G_1(x)/x$ is $L^2(0,\infty)$ as well.

In order to complete the proof of Theorem 3 it must now be shown that $G_1(x)/x$ as defined by (8) or (9) is the kernel of a Watson transform. Three methods of proof will be used, the first by finding the Mellin transform of $G_1(x)/x$, the second by using the properties of the resultants of Watson kernels, and the third by using a known property of the kernels themselves.


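Using the Parseval relation for Mellin transforms,
\[ \int_0^\infty \frac{K_1(t)}{t}\,\frac{\sin xt}{xt}\,dt = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\sin\tfrac12\pi(s-1)\,\Gamma(s-1)\,x^{-s}\,ds \]
\[ = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\frac{\Gamma(s)}{1-s}\,x^{-s}\,ds \]
as the Mellin transforms of $K_1(t)/t$ and $\sin xt/xt$ are $k(s)/(1-s)$ and $\sin\tfrac12\pi(s-1)\,\Gamma(s-1)\,x^{-s}$ respectively.

Similarly,
\[ \int_0^\xi dx \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\frac{\Gamma(s)}{1-s}\,\xi^{1-s}\,ds, \]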













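\[ \int_0^\xi dx\,\frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\Gamma(s)\,x^{-s}\,ds = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\frac{\Gamma(s)}{1-s}\,\xi^{1-s}\,ds \]
as the Mellin transform of the function equal to unity in $(0,\xi)$ and zero otherwise is $\xi^s/s$ and the transform of
\[ \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\Gamma(s)\,x^{-s}\,ds \]
is $k(1-s)\cos\tfrac12\pi s\,\Gamma(s)/s$. Therefore, differentiating the last two formulae with respect to $\xi$,
\[ \int_0^\infty \frac{K_1(t)}{t}\cos xt\,dt = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\Gamma(s)\,x^{-s}\,ds \]
for almost all x. Hence
\[ \frac{G_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{k(1-s)}{s}\,\cos\tfrac12\pi s\,\Gamma(s) \Bigl( \frac{1}{1-s} - 1 \Bigr) x^{-s}\,ds \]
\[ = \sqrt{\frac{2}{\pi}}\,\frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{f(s)\cos\tfrac12\pi s}{1-s}\,x^{-s}\,ds \]
almost everywhere, and so the Mellin transform of $G_1(x)/x$ is $\sqrt{2/\pi}\,f(s)\cos\tfrac12\pi s/(1-s)$. But if $g(s) = \sqrt{2/\pi}\,f(s)\cos\tfrac12\pi s$ then
\[ g(s)\,g(1-s) = (2/\pi)\,f(s)\,f(1-s)\,\cos\tfrac12\pi s\,\cos\tfrac12\pi(1-s) = 1 \]
by (2) for $R(s) = \tfrac12$. Therefore, if $g(s)/(1-s)$ is the transform of $G_1(x)/x$, $g(s)g(1-s) = 1$ for $R(s) = \tfrac12$.

Also, using the fact that f(s) takes conjugate values for conjugate values of s, $|g(\tfrac12 + it)| = 1$ for t in $(-\infty, \infty)$.

Therefore, by T., Theorem 129, $G_1(x)/x$ is the kernel of a Watson transform and the Theorem is proved.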


For the second method of proof, we first transform (8) by using T., Theorem 69. This Theorem yields the following result: if $K_1(t)/t$ has the cosine transform
\[ \sqrt{\frac{2}{\pi}} \int_0^\infty \frac{K_1(t)}{t}\cos yt\,dt \]
then $\int_t^\infty \frac{K_1(u)}{u^2}\,du$ has the cosine transform
\[ \sqrt{\frac{2}{\pi}}\,\frac{1}{y} \int_0^y dv \int_0^\infty \frac{K_1(t)}{t}\cos vt\,dt. \]

























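Therefore
\[ \sqrt{\frac{2}{\pi}}\,\frac{1}{x} \int_0^x dy \int_0^\infty \frac{K_1(t)}{t}\cos yt\,dt = \sqrt{\frac{2}{\pi}} \int_0^\infty \cos xt\,dt \int_t^\infty \frac{K_1(u)}{u^2}\,du \]
and so, using (8),
\[ \frac{G_1(x)}{x} = \sqrt{\frac{2}{\pi}} \int_0^\infty \cos xt \Bigl( \int_t^\infty \frac{K_1(u)}{u^2}\,du - \frac{K_1(t)}{t} \Bigr) dt. \tag{10} \]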




Now if $L_1(x)/x$, $M_1(x)/x$ and $N_1(x)/x$ are Watson kernels with Mellin transforms $l(s)/(1-s)$, $m(s)/(1-s)$, and $n(s)/(1-s)$, and if $B_1(x)/x$ is the M transform of $L_1(x)/x$ and $C_1(x)/x$ the N transform of $B_1(x)/x$, then the Mellin transform of $B_1(x)/x$ is $l(1-s)m(s)/s$ and that of $C_1(x)/x$ is $l(s)m(1-s)n(s)/(1-s)$ (T., 8.6). But if $p(s) = l(s)m(1-s)n(s)$, then $p(s)p(1-s) = 1$ and so $C_1(x)/x$ is the kernel of a Watson transform.








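Now let
\[ \frac{L_1(x)}{x} = \begin{cases} 0 & (0 < x < 1), \\ 1/x & (1 < x), \end{cases} \qquad l(s) = 1, \]
\[ \frac{M_1(x)}{x} = \frac{K_1(x)}{x}, \qquad m(s) = k(s), \]
\[ \frac{N_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{\sin x}{x}, \qquad n(s) = \sqrt{\frac{2}{\pi}}\,\Gamma(s)\cos\tfrac12\pi s, \]
where $K_1(x)/x$ is the kernel in the formula for $G_1(x)/x$. Then
\[ \frac{B_1(x)}{x} = \frac{d}{dx} \int_1^\infty \frac{K_1(xt)}{t^2}\,dt = \frac{d}{dx} \Bigl( x \int_x^\infty \frac{K_1(t)}{t^2}\,dt \Bigr) = \int_x^\infty \frac{K_1(t)}{t^2}\,dt - \frac{K_1(x)}{x} \]
and
\[ \frac{C_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{d}{dx} \int_0^\infty \frac{\sin xt}{t} \Bigl( \int_t^\infty \frac{K_1(u)}{u^2}\,du - \frac{K_1(t)}{t} \Bigr) dt \]
\[ = \sqrt{\frac{2}{\pi}} \int_0^\infty \cos xt \Bigl( \int_t^\infty \frac{K_1(u)}{u^2}\,du - \frac{K_1(t)}{t} \Bigr) dt \]
where the final integral is to be taken in the mean square sense. The final transformation is valid since the last two expressions are only different forms for the cosine transform of a function of $L^2(0,\infty)$.

But this is just $G_1(x)/x$. Therefore $G_1(x)/x$ is the kernel of a Watson transform and its Mellin transform is the Mellin transform of $C_1(x)/x$, i.e.
\[ l(s)\,m(1-s)\,n(s)/(1-s) = \sqrt{\frac{2}{\pi}}\,k(1-s)\,\Gamma(s)\cos\tfrac12\pi s/(1-s) = \sqrt{\frac{2}{\pi}}\,f(s)\cos\tfrac12\pi s/(1-s). \]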













For the third method of proof, we write formula (10) in the form
\[ \frac{G_1(x)}{x} = \sqrt{\frac{2}{\pi}} \int_0^\infty \cos xt\,dt \int_t^\infty \frac{K_1(u) - K_1(t)}{u^2}\,du \]
where the inner integral is a function of t of the class $L^2(0,\infty)$ and therefore the outer integral is a mean square integral. Hence the cosine transform of $G_1(x)/x$ is
\[ \int_t^\infty \frac{K_1(u) - K_1(t)}{u^2}\,du \]
and therefore that of $G_1(ax)/ax$ is
\[ \frac{1}{a} \int_{t/a}^\infty \frac{K_1(u) - K_1(t/a)}{u^2}\,du. \]
Using the Parseval relation for cosine transforms,
\[ I(a) = \int_0^\infty \frac{G_1(t)\,G_1(at)}{t^2}\,dt = \int_0^\infty dt \int_t^\infty \frac{K_1(u) - K_1(t)}{u^2}\,du \int_{t/a}^\infty \frac{K_1(v) - K_1(t/a)}{v^2}\,dv \]
\[ = a \int_0^\infty \frac{dt}{t^2} \int_1^\infty \frac{K_1(ut) - K_1(t)}{u^2}\,du \int_1^\infty \frac{K_1(tv/a) - K_1(t/a)}{v^2}\,dv. \]


The order of integration in the final triple integral may be changed as this integral is absolutely convergent. For
\[ a \int_0^\infty \frac{dt}{t^2} \int_1^\infty \frac{|K_1(ut)| + |K_1(t)|}{u^2}\,du \int_1^\infty \frac{|K_1(tv/a)| + |K_1(t/a)|}{v^2}\,dv \]
\[ = \int_0^\infty dt \int_t^\infty \frac{|K_1(u)| + |K_1(t)|}{u^2}\,du \int_{t/a}^\infty \frac{|K_1(v)| + |K_1(t/a)|}{v^2}\,dv \]
\[ = \int_0^\infty dt \Bigl\{ \int_t^\infty \frac{|K_1(u)|}{u^2}\,du + \frac{|K_1(t)|}{t} \Bigr\} \Bigl\{ \int_{t/a}^\infty \frac{|K_1(v)|}{v^2}\,dv + a\,\frac{|K_1(t/a)|}{t} \Bigr\} \]
and since $K_1(t)/t$ is $L^2(0,\infty)$ so is $\int_t^\infty \frac{|K_1(u)|}{u^2}\,du$ (T., Theorem 69 as before) and the integrand of the t integral is the product of two functions of $L^2(0,\infty)$ and is therefore $L(0,\infty)$.
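Changing the order of integration,
\[ I(a) = a \int_1^\infty \frac{du}{u^2} \int_1^\infty \frac{dv}{v^2} \int_0^\infty \{ K_1(ut) - K_1(t) \}\{ K_1(tv/a) - K_1(t/a) \}\,\frac{dt}{t^2} \]
\[ = a \int_1^\infty \frac{du}{u^2} \int_1^\infty \frac{dv}{v^2}\,\{ \min(u, v/a) + \min(1, 1/a) - \min(1, v/a) - \min(u, 1/a) \} \]
since
\[ \int_0^\infty \frac{K_1(at)\,K_1(bt)}{t^2}\,dt = \min(a, b) \]
as $K_1(t)/t$ is the kernel of a Watson transform.
If we assume 0 < a < 1, then $\min(1, 1/a) = \min(1, v/a) = 1$ for $v \ge 1$. Therefore
\[ I(a) = a \int_1^\infty \frac{du}{u^2} \int_1^\infty \frac{dv}{v^2}\,\{ \min(u, v/a) - \min(u, 1/a) \}. \]
Also, if u < 1/a then u < v/a in our range of integration and so $\min(u, v/a) - \min(u, 1/a) = 0$ and
\[ I(a) = a \int_{1/a}^\infty \frac{du}{u^2} \Bigl( \int_1^{au} \frac{dv}{av} + u \int_{au}^\infty \frac{dv}{v^2} - \frac{1}{a} \int_1^\infty \frac{dv}{v^2} \Bigr) = a. \]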
A similar proof would show that I(a) = 1 if 1 < a, and so
\[ \int_0^\infty \frac{G_1(t)\,G_1(at)}{t^2}\,dt = \min(1, a). \]
Setting a = b/c and t = cu,
\[ \int_0^\infty \frac{G_1(cu)\,G_1(bu)}{u^2}\,du = c\,\min(1, b/c) = \min(c, b), \]
and therefore $G_1(t)/t$ is the kernel of a Watson transform (T., Theorem 131).

COROLLARY. The condition that F(x) be real may be replaced by the condition that $k(\tfrac12 + it)$ be bounded in $(-\infty, \infty)$.

This follows from Corollary 1, Theorem 2.

The following example shows that Theorems 2 and 3 do not hold for all functions F(x) satisfying (1) and that some extra condition must be imposed if F(x) may be complex.

Let $F(x) = a^{1/2} e^{-ax}$ where $0 < \operatorname{am} a < \pi/2$. Then $F(x) \in L^2(0,\infty)$ and satisfies (1). Also
\[ f(s) = a^{1/2} \int_0^\infty e^{-ax}\,x^{s-1}\,dx = a^{\frac12 - s}\,\Gamma(s) \]
and $k(s) = f(1-s)/\Gamma(1-s) = a^{s - \frac12}$ or $k(\tfrac12 + it) = a^{it}$. Thus $|k(\tfrac12 + it)| = e^{-t \operatorname{am} a}$ and so $k(s)/(1-s)$ does not belong to any integrable class in $(\tfrac12 - i\infty, \tfrac12 + i\infty)$. The arguments in Theorems 2 and 3 therefore break down completely.


4. Examples. (i). Let $F(x) = e^{-x}$. Then (1) is satisfied, $f(s) = \Gamma(s)$, and k(s) = 1. Therefore $K_1(t)/t = 0$ for t < 1, $K_1(t)/t = 1/t$ for t > 1 and $G_1(x)/x = \sqrt{2/\pi}\,\sin x/x$.

In this case the result (b) of the formal analysis holds and
\[ G(x) = \sqrt{\frac{2}{\pi}}\;\frac{F(ix) + F(-ix)}{2} = \sqrt{\frac{2}{\pi}}\,\cos x \]
is a Fourier kernel.


(ii). If $F(x) = \sqrt{\frac{2}{\pi}}\,\frac{x}{1+x^2}$ then $f(s) = \sqrt{\frac{\pi}{2}}\,\frac{1}{\cos\tfrac12\pi s}$, and the Mellin transform of $G_1(x)/x$ is
\[ \sqrt{\frac{2}{\pi}}\,f(s)\cos\tfrac12\pi s/(1-s) = 1/(1-s). \]






















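Therefore
\[ \frac{G_1(x)}{x} = \begin{cases} 0 & (0 < x < 1), \\ 1/x & (1 < x), \end{cases} \]
and the Watson transform of which $G_1(x)/x$ is the kernel transforms h(t) into h(1/t)/t.

In this case $\tfrac12\sqrt{2/\pi}\,\{F(ix) + F(-ix)\} = 0$ for all values of x except $\pm 1$, and so the result (b) of the formal analysis does not hold.

Theorem 3 gives
\[ \frac{G_1(x)}{x} = \frac{1}{x} \lim_{u \to 0} \frac{1}{\pi i} \int_{u-ix}^{u+ix} \frac{w}{1+w^2}\,dw = \frac{1}{2\pi i x} \lim_{u \to 0} \log \frac{1 + (u+ix)^2}{1 + (u-ix)^2} \]
\[ = \frac{1}{2\pi x} \lim_{u \to 0} \operatorname{am} \Bigl\{ \frac{1 + (u+ix)^2}{1 + (u-ix)^2} \Bigr\} = \begin{cases} 0 & (0 < x < 1), \\ 1/x & (1 < x). \end{cases} \]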
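The limit in example (ii) can be illustrated numerically. The sketch below rests on two assumptions that are not spelled out in the text: that F is real on the real axis, so that (F(u+iv) + F(u-iv))/2 is the real part of F(u+iv), and that the integral defining $G_1(u,x)$ has the closed form $\pi^{-1}\operatorname{am}\{1 + (u+ix)^2\}$ for this F. The names `G1_numeric` and `G1_closed` are hypothetical.

```python
import math

C = math.sqrt(2.0 / math.pi)

def F(z):
    # analytic continuation of F(x) = sqrt(2/pi) * x / (1 + x^2)
    return C * z / (1.0 + z * z)

def G1_numeric(u, x, n=20000):
    """Midpoint rule for sqrt(2/pi) * int_0^x (F(u+iv) + F(u-iv))/2 dv."""
    h = x / n
    total = 0.0
    for i in range(n):
        v = (i + 0.5) * h
        z = complex(u, v)
        total += 0.5 * (F(z) + F(z.conjugate())).real
    return C * total * h

def G1_closed(u, x):
    # (1/pi) * am(1 + (u + ix)^2), assumed closed form for this F
    return math.atan2(2.0 * u * x, 1.0 + u * u - x * x) / math.pi
```

For small u the closed form is close to 0 when 0 < x < 1 and close to 1 when x > 1, in line with $G_1(x)/x = 1/x$ for x > 1.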
2 
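(iii). If $F(x) = \sqrt{\frac{2}{\pi}}\,\frac{1}{1+x^2}$ then all our conditions are satisfied and
\[ f(s) = \sqrt{\frac{\pi}{2}}\,\operatorname{cosec}\tfrac12\pi s, \qquad k(s) = \sqrt{\frac{2}{\pi}}\,\Gamma(s)\sin\tfrac12\pi s, \qquad \frac{K_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{1 - \cos x}{x} \]
and
\[ \frac{G_1(x)}{x} = \frac{1}{\pi x} \log \Bigl| \frac{1+x}{1-x} \Bigr|. \]
In this case the formal result holds as well with $G(x) = 2/\pi(1 - x^2)$.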
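Formula (5) can be checked against example (iii). This is a rough numerical sketch: the infinite range is truncated at t = 50/x (an assumption made only for the quadrature), and the names `K1` and `F_from_5` are hypothetical.

```python
import math

C = math.sqrt(2.0 / math.pi)

def K1(t):
    # K_1(t) = sqrt(2/pi) * (1 - cos t), from example (iii)
    return C * (1.0 - math.cos(t))

def F_from_5(x, n=100000):
    """x * int_0^T K_1(t) exp(-x t) dt with T = 50/x (midpoint rule)."""
    T = 50.0 / x               # exp(-x T) = exp(-50), so the tail is negligible
    h = T / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        total += K1(t) * math.exp(-x * t)
    return x * total * h
```

The value `F_from_5(x)` should reproduce $\sqrt{2/\pi}/(1+x^2)$ for moderate positive x.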
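(iv). In general, if any function F(x) of the class $L^2(0,\infty)$ satisfies (1) then so does any Watson transform of F(x).
For if the Watson transform with kernel $M_1(t)/t$ of F(x) is H(x), i.e.
\[ H(x) = \frac{d}{dx} \int_0^\infty F(t)\,\frac{M_1(xt)}{t}\,dt, \]
then the transform of F(ux) is
\[ \frac{d}{dx} \int_0^\infty F(ut)\,\frac{M_1(xt)}{t}\,dt = \frac{d}{dx} \int_0^\infty F(t)\,\frac{M_1(xt/u)}{t}\,dt = \frac{1}{u}\,H\Bigl(\frac{x}{u}\Bigr). \]
Therefore, by the general Parseval relation for Watson transforms (a slight extension of T., 8.5.8)
\[ \int_0^\infty F(x)\,F(ux)\,dx = \frac{1}{u} \int_0^\infty H(x)\,H\Bigl(\frac{x}{u}\Bigr)\,dx = \int_0^\infty H(x)\,H(ux)\,dx \]
or
\[ \int_0^\infty H(x)\,H(ux)\,dx = \frac{1}{1+u}. \]
Taking $F(x) = \sqrt{\frac{2}{\pi}}\,\frac{1}{1+x^2}$ and $\frac{M_1(t)}{t} = \sqrt{\frac{2}{\pi}}\,\frac{1 - \cos t}{t}$ we obtain
\[ H(x) = \frac{2}{\pi}\,\frac{d}{dx} \int_0^\infty \frac{1}{1+t^2}\,\frac{1 - \cos xt}{t}\,dt = \frac{2}{\pi} \int_0^\infty \frac{\sin xt}{1+t^2}\,dt. \]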
x dx 










In this case h(s), the Mellin transform of H(x), is $\Gamma(s)\tan\tfrac12\pi s$, and applying Theorem 2 to H(x) we obtain $k(s) = h(1-s)/\Gamma(1-s) = \cot\tfrac12\pi s$ and
\[ \frac{K_1(t)}{t} = \frac{1}{\pi t} \log \Bigl| \frac{1+t}{1-t} \Bigr|. \]
Therefore (5) becomes
\[ H(x) = \frac{x}{\pi} \int_0^\infty \log \Bigl| \frac{1+t}{1-t} \Bigr|\,e^{-xt}\,dt = \frac{2}{\pi} \int_0^\infty \frac{e^{-xt}}{1-t^2}\,dt \]
after integrating by parts and taking the final integral as a principal value at t = 1. This second expression for H(x) can also be obtained from the first by the calculus of residues.

The first expression for H(x) cannot be used in Theorem 3 for calculating $G_1(x)/x$ as it has no meaning unless x is real. Using the second, however, we obtain
\[ \frac{G_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{1 - \cos x}{x}. \]


5. The three theorems above may be extended to include the case where there are two different functions involved in (1).
THEOREM 4. Let F(x) and H(x) of the class $L^2(0,\infty)$ satisfy the equation
\[ \int_0^\infty F(x)\,H(ux)\,dx = \frac{1}{1+u} \qquad (u > 0) \tag{11} \]
and have Mellin transforms f(s) and h(s). Then $f(s)h(1-s) = \Gamma(s)\Gamma(1-s)$. If $k(s) = f(1-s)/\Gamma(1-s)$ and $l(s) = h(1-s)/\Gamma(1-s)$ are bounded for s in the range $(\tfrac12 - i\infty, \tfrac12 + i\infty)$ then (5) holds with $K_1(t)$ defined as in (4). A similar formula holds for H(x) with $K_1(t)$ replaced by $L_1(t)$ where
\[ \frac{L_1(x)}{x} = \frac{1}{2\pi i} \int_{\frac12 - i\infty}^{\frac12 + i\infty} \frac{l(s)}{1-s}\,x^{-s}\,ds. \]
The functions $K_1(t)/t$ and $L_1(t)/t$ belong to the class $L^2(0,\infty)$ and are the kernels of conjugate Watson transforms.

Functions $G_1(x)/x$ defined as in Theorem 3 and $M_1(x)/x$ defined in the same way with H(z) replacing F(z) exist and are conjugate Watson kernels with Mellin transforms $\sqrt{2/\pi}\,f(s)\cos\tfrac12\pi s/(1-s)$ and $\sqrt{2/\pi}\,h(s)\cos\tfrac12\pi s/(1-s)$.

This Theorem is proved in almost the same way as Theorems 1, 2, and 3, and so no proof will be given.

In this Theorem it must be specified that k(s) and l(s) are bounded on $(\tfrac12 - i\infty, \tfrac12 + i\infty)$ and the condition that F(x) and H(x) be real is not sufficient for the sections of the Theorem corresponding to Theorems 2 and 3.

This can be shown by taking $F(x) = xe^{-x}$ and $H(x) = (1 - e^{-x})/x$. These functions are $L^2(0,\infty)$ and satisfy (11), but as k(s) = 1 - s no function $K_1(t)/t$ can be defined and the argument breaks down. The functions
\[ G(x) = \tfrac12\sqrt{\frac{2}{\pi}}\,\{ F(ix) + F(-ix) \} = \sqrt{\frac{2}{\pi}}\,x \sin x \]
and
\[ M(x) = \tfrac12\sqrt{\frac{2}{\pi}}\,\{ H(ix) + H(-ix) \} = \sqrt{\frac{2}{\pi}}\,\sin x / x \]
obtained as in the initial formal argument are, however, conjugate Fourier kernels in a certain sense. In fact, if q(t) is a function such that $tq(t) \in L^2(0,\infty)$, then its G transform is
\[ r(x) = \sqrt{\frac{2}{\pi}} \int_0^\infty q(t)\,xt \sin xt\,dt = x\,T_s\{ tq(t) \} \]
where $T_s\{ tq(t) \}$ is the sine transform of tq(t). Further, the M transform of r(x) is
\[ \sqrt{\frac{2}{\pi}} \int_0^\infty t\,T_s\{ tq(t) \}\,\frac{\sin xt}{xt}\,dt = \frac{1}{x}\,xq(x) = q(x). \]
Thus, in a certain sense, the original formal assertion (b) still holds even though our rigorous argument breaks down.
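The counterexample can be checked numerically. The following sketch verifies (11) for $F(x) = xe^{-x}$, $H(x) = (1 - e^{-x})/x$ with a midpoint rule, and evaluates $k(s) = f(1-s)/\Gamma(1-s)$ at real s using $f(s) = \Gamma(s+1)$, which is the Mellin transform of $xe^{-x}$. The helper names are hypothetical.

```python
import math

def integrate_half_line(g, n=20000):
    """Midpoint rule for the integral of g over (0, inf), using x = t/(1-t)."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        t = (i + 0.5) * h
        x = t / (1.0 - t)
        total += g(x) / (1.0 - t) ** 2
    return total * h

def F(x):
    return x * math.exp(-x)

def H(x):
    # (1 - exp(-x)) / x, written with expm1 to limit cancellation for small x
    return -math.expm1(-x) / x

def lhs_of_11(u):
    """Left-hand side of (11): integral of F(x) H(ux) over (0, inf)."""
    return integrate_half_line(lambda x: F(x) * H(u * x))

def k(s):
    # k(s) = f(1-s) / Gamma(1-s) with f(s) = Gamma(s+1), real 0 < s < 1 only
    return math.gamma(2.0 - s) / math.gamma(1.0 - s)
```

Here `lhs_of_11(u)` should be close to 1/(1 + u), and `k(s)` should equal 1 - s, which is unbounded on the line $R(s) = \tfrac12$.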

Other extensions of Theorems 1, 2, and 3 are obtained by replacing equation 
(1) by slightly more general equations as in the following two theorems. 


THEOREM 5. Let F(x) be a real function defined in $(0,\infty)$ such that
\[ \int_0^\infty \{F(x)\}^2\,x^{c-1}\,dx < \infty \qquad (c > 0), \]
\[ \int_0^\infty F(x)\,F(ux)\,x^{c-1}\,dx = \frac{\Gamma(c)}{(1+u)^c}, \tag{12} \]
and let f(s) be the Mellin transform of F(x) defined for R(s) = c/2. Then
\[ f(s)\,f(c-s) = \Gamma(s)\,\Gamma(c-s) \qquad (R(s) = c/2) \]
and if
\[ k(s) = \frac{f(c - cs)}{\Gamma(c - cs)} \qquad (R(s) = 1/2) \]
then
\[ k(s)\,k(1-s) = 1 \qquad (R(s) = 1/2). \]
If $K_1(t)/t$ is the Watson kernel derived from k(s) in the usual way, then
\[ F(x) = x \int_0^\infty K_1(t^c)\,e^{-xt}\,dt. \tag{13} \]
Conversely, if $K_1(t)/t$ is any $L^2$ Watson kernel, then (13) defines a function F(x) that satisfies (12).

The proof will not be given as the only difference between it and the proof of Theorems 1 and 2 is that the more general Mellin transform theory of T., Theorems 71, 72, and 73 is used.














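Examples. (i). If k(s) = 1, then
\[ \frac{K_1(t)}{t} = \begin{cases} 0 & (0 < t < 1), \\ 1/t & (1 < t), \end{cases} \]
and this leads to $F(x) = e^{-x}$ which obviously satisfies (12).

(ii). If $K_1(x)/x = \sqrt{2/\pi}\,\sin x/x$ then $F(x) = \sqrt{\frac{2}{\pi}}\,x \int_0^\infty e^{-xt} \sin t^c\,dt$. This can be integrated in finite terms if $c = \tfrac12$, giving
\[ F(x) = \frac{1}{\sqrt{2x}}\,e^{-1/4x} \]
as a solution of (12) with $c = \tfrac12$.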
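Example (ii) can be spot-checked numerically. In this sketch the substitution x = 1/w² is an implementation choice, made because it removes the slowly decaying tail of the integrand; the range of w is truncated at w = 12, where the integrand is negligible for the values of u used. The names `F_half` and `lhs_of_12` are hypothetical.

```python
import math

def F_half(x):
    # F(x) = exp(-1/(4x)) / sqrt(2x), the solution quoted for c = 1/2
    return math.exp(-1.0 / (4.0 * x)) / math.sqrt(2.0 * x)

def lhs_of_12(u, c=0.5, w_max=12.0, n=20000):
    """Integral of F(x) F(ux) x^(c-1) over (0, inf), computed with x = 1/w^2."""
    h = w_max / n
    total = 0.0
    for i in range(n):
        w = (i + 0.5) * h
        x = 1.0 / (w * w)
        jac = 2.0 / w ** 3     # |dx/dw|
        total += F_half(x) * F_half(u * x) * x ** (c - 1.0) * jac
    return total * h
```

With c = 1/2 the result should match $\Gamma(\tfrac12)/(1+u)^{1/2}$.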


THEOREM 6. Let F(x) be a real function belonging to the class $L^2(0,\infty)$ and satisfying
\[ \int_0^\infty F(x)\,F(ux)\,dx = \frac{u^{\frac{c-1}{2}}\,\Gamma(c)}{(1+u)^c} \tag{14} \]
with c > 0 for all positive u.

If f(s) is the Mellin transform of F(x) and
\[ k(s) = \frac{f(1-s)}{\Gamma\bigl(\tfrac{c}{2} + \tfrac12 - s\bigr)} \]
then $k(s)k(1-s) = 1$ for $R(s) = \tfrac12$ and
\[ F(x) = -x\,\frac{d}{dx} \int_0^\infty \frac{K_1(t)}{t}\,(xt)^{\frac{c-1}{2}}\,e^{-xt}\,dt \tag{15} \]
where $K_1(t)/t$ is the Watson kernel derived from k(s) in the usual way.

This Theorem is proved in the same way as Theorems 1 and 2 and so the proof will not be given.
As a formal deduction from the formula for F(x) we obtain
\[ F(x) = x^{\frac{c-1}{2}} \int_0^\infty t^{\frac{c-1}{2}}\,e^{-xt}\,K(t)\,dt \]
where K(t) is a Fourier kernel.
Example. If $\frac{K_1(x)}{x} = \sqrt{\frac{2}{\pi}}\,\frac{1 - \cos x}{x}$, then (15) with c = 3 gives
\[ F(x) = \sqrt{\frac{2}{\pi}}\,\frac{2x^2}{(1+x^2)^2} \]
as a function satisfying (14).





Queen's University 
Kingston, Ont. 








MINIMAL SOLUTIONS OF DIOPHANTINE EQUATIONS 
L. HOLZER 


OUR aim is to prove: If a, b, c are positive integers, if ab > 1, if a, b, c are relatively prime in pairs, and all are free of squares, if -ab is a quadratic residue of c, bc of a, ca of b, and if $F(x,y,z) = ax^2 + by^2 - cz^2$, we have non-trivial solutions of
(1) F(x,y,z) = 0
with the inequalities $|x| < \sqrt{bc}$, $|y| < \sqrt{ca}$, $|z| < \sqrt{ab}$.

It is clear that among the three inequalities we need only prove the third, since the two others necessarily follow.

If ab = 1 there is always a solution with z = 1. This case is known and will not be considered. The inequalities for x and y still hold, except that if c = 1 one sign < must be replaced by =.
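The statement can be tested by brute force for small coefficients. The sketch below is an illustration only and is unrelated to the method of proof; "quadratic residue" is read as solvability of $x^2 \equiv n$ modulo the given modulus, and all function names are hypothetical.

```python
from math import gcd, isqrt

def is_squarefree(n):
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

def is_square_mod(n, m):
    """True if x^2 = n (mod m) has a solution (brute force over residues)."""
    return any((x * x - n) % m == 0 for x in range(m))

def satisfies_hypotheses(a, b, c):
    return (a * b > 1
            and gcd(a, b) == 1 and gcd(b, c) == 1 and gcd(c, a) == 1
            and all(is_squarefree(n) for n in (a, b, c))
            and is_square_mod(-a * b, c)
            and is_square_mod(b * c, a)
            and is_square_mod(c * a, b))

def small_solution(a, b, c):
    """Search for (x, y, z) != (0, 0, 0) with a x^2 + b y^2 = c z^2
    and x^2 < bc, y^2 < ca, z^2 < ab; return None if there is none."""
    for z in range(isqrt(a * b) + 1):
        if z * z >= a * b:
            continue
        for x in range(isqrt(b * c) + 1):
            if x * x >= b * c:
                continue
            for y in range(isqrt(c * a) + 1):
                if y * y >= c * a:
                    continue
                if (x, y, z) != (0, 0, 0) and a * x * x + b * y * y == c * z * z:
                    return (x, y, z)
    return None
```

For instance, `small_solution(2, 3, 5)` finds a solution of $2x^2 + 3y^2 = 5z^2$ inside the stated bounds.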


I 


LEMMA 1: If $a_1, a_2, a_3$ are any integers, $f(x,y,z) = a_1x + a_2y + a_3z$, there are integers u, v, w not all zero with $|u| < \sqrt{bc}$, $|v| < \sqrt{ca}$, $|w| < \sqrt{ab}$, $f(u,v,w) \equiv 0 \bmod abc$.

Proof: Putting $x = 0, 1, \ldots, [\sqrt{bc}]$ (the bracket signifies the greatest integer), $y = 0, 1, \ldots, [\sqrt{ca}]$, $z = 0, 1, \ldots, [\sqrt{ab}]$, we have more than abc numbers f(x,y,z). Therefore we must have a pair of triples $(x_1,y_1,z_1)$, $(x_2,y_2,z_2)$ with $f(x_1,y_1,z_1) \equiv f(x_2,y_2,z_2) \bmod abc$. Putting $u = x_2 - x_1$ etc., we have $f(u,v,w) \equiv 0 \bmod abc$.
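The pigeonhole argument of Lemma 1 translates directly into a search. The following is a minimal sketch with a hypothetical function name; it assumes bc, ca and ab are not perfect squares, so that the strict inequalities of the lemma hold for the differences it returns.

```python
from math import isqrt

def lemma1_triple(a, b, c, coeffs):
    """Return (u, v, w), not all zero, with a1*u + a2*v + a3*w = 0 (mod abc),
    found as the difference of two box points with equal residue."""
    a1, a2, a3 = coeffs
    m = a * b * c
    seen = {}                                  # residue -> first triple seen
    for x in range(isqrt(b * c) + 1):
        for y in range(isqrt(c * a) + 1):
            for z in range(isqrt(a * b) + 1):
                r = (a1 * x + a2 * y + a3 * z) % m
                if r in seen:
                    x1, y1, z1 = seen[r]
                    return (x - x1, y - y1, z - z1)
                seen[r] = (x, y, z)
    return None                                # not reached: box has more than abc points
```

With a = 2, b = 3, c = 5 the box holds 4 · 4 · 3 = 48 points against 30 residues, so a collision must occur.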


LEMMA 2: There are numbers u, v, w satisfying the inequalities $|u| < \sqrt{bc}$, etc. and $F(u,v,w) \equiv 0 \bmod abc$.

Proof: If $A^2 \equiv c/b \bmod a$, $B^2 \equiv a/c \bmod b$, $C^2 \equiv -b/a \bmod c$ we put $f = ab(x - Cy) + bc(y - Az) + ca(z - Bx)$ and have $u \equiv Cv \bmod c$, $v \equiv Aw \bmod a$, $w \equiv Bu \bmod b$, $F(u,v,w) \equiv 0 \bmod abc$.

We use the abbreviation $F(u,v,w) = F_1$. If there is any other triple $(u',v',w')$ essentially different, i.e. not $(-u,-v,-w)$, we write $F_3 = F(u',v',w')$, $F_2 = auu' + bvv' - cww'$.

LEMMA 3: We have only the cases $F_1 = 0$ or abc, likewise $F_3$, whereas $F_2$ can be $= 0, \pm abc, \pm 2abc$.

Proof: The proof for $F_1$ (and $F_3$) results immediately from the inequalities and $F_1 \equiv 0 \bmod abc$. The congruences $u \equiv Cv \bmod c$, $u' \equiv Cv' \bmod c$, etc. imply $auu' + bvv' \equiv 0 \bmod c$, etc., $F_2 \equiv 0 \bmod abc$. The inequalities show that $F_2$ has one of the five values.


Received February 9, 1949. 














The theorem is proved if $F_1$ or $F_3 = 0$. Therefore we suppose $F_1 = F_3 = abc$. For the rest of the demonstration it is significant that the cases w = 0 and w = w' need not be taken into consideration in the sense that they either result in an equation (1) with $|z| < \sqrt{ab}$ or are impossible.


LEMMA 4: The case w = 0 need not be taken into consideration.

Proof: We have $au^2 + bv^2 = abc$, i.e. $au^2 \equiv 0 \bmod b$, and as a and b are prime to each other $u^2 \equiv 0 \bmod b$, $u \equiv 0 \bmod b$, for b is free of squares. Writing $u = bu_1$, $v = av_1$, so that $u_1, v_1$ are integers, we have $av_1^2 + bu_1^2 = c$, an equation (1) which satisfies $|z| = 1 < \sqrt{ab}$.


LEMMA 5: The case w = w' need not be taken into consideration.
Proof: We have in their turn:

(i) $F_2 = abc$. At once $a(u - u')^2 + b(v - v')^2 = 0$, i.e., u = u', v = v'. The two triples would not be different from each other.

(ii) $F_2 = -abc$. We have $a(u - u')^2 + b(v - v')^2 = 4abc$, and analogously as in the proof of Lemma 4 we get $u - u' = bu_1$, $v - v' = av_1$, $av_1^2 + bu_1^2 = 4c$, a solution of (1) with $z = 2 < \sqrt{ab}$ except for the cases a = b = 1; a = 1, b = 2 or 3 and vice versa. In all these cases there exists a solution with z = 1.

(iii) $F_2 = 0$. We get $a(u - u')^2 + b(v - v')^2 = 2abc$. As above there is
(2) $av_1^2 + bu_1^2 = 2c$.
Expressing u', v' by $u, v, u_1, v_1$ and substituting in $F_2 = 0$ we get
(3) $uu_1 + vv_1 = c$.
Eliminating $u_1$ from (2) and (3) we obtain the quadratic equation for $v_1$,
(4) $(au^2 + bv^2)v_1^2 - 2bcvv_1 + (bc^2 - 2cu^2) = 0$,
whose discriminant must be a square. This gives easily
(5) $2(au^2 + bv^2) - abc = ct^2$ (t an integer).
Taking into account $F_1 = abc$, we obtain from (5)
$ab + 2w^2 = t^2$.

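We can consider w and t as positive. We have $w < \sqrt{ab}$, $\sqrt{ab} < t < \sqrt{3ab}$. We put
\[ (t + w\sqrt2)(-1 + \sqrt2) = T + U\sqrt2, \quad \text{i.e.,} \quad T = -t + 2w, \quad U = t - w. \]
We have U > 0. The boundary values $t = \sqrt{3ab}$ and $w = \sqrt{ab}$, $t = \sqrt{ab}$ and w = 0 give $U = \sqrt{ab}(\sqrt3 - 1)$, $U = \sqrt{ab}$, respectively, whereas the relative minimum calculated by differentiating
\[ t - w - \lambda(t^2 - 2w^2 - ab) \]
partially with respect to t and w gives
\[ 1 - 2\lambda t = 0, \quad -1 + 4\lambda w = 0, \quad t = 2w, \quad w = \sqrt{ab/2}, \quad U = \sqrt{ab/2}. \]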













In any case we have $U < \sqrt{ab}$. We have also
(6) $T^2 + ab = 2U^2$.

With n = norm with respect to the field $R(\sqrt{-ab})$ (R is the rational field) (2) and (6) can be written
(7) $\frac{1}{a}\,n(av_1 + \sqrt{-ab}\,u_1) = 2c$,
(8) $n(T + \sqrt{-ab}) = 2U^2$.

With the abbreviations
(9) $2X = v_1T - bu_1$, $2Y = av_1 + u_1T$,
we get by multiplying (7) and (8),
(10) $aX^2 + bY^2 = cU^2$.

If a and b are odd or a is even the numbers X and Y are integers. For a, b odd gives T odd and $u_1 \equiv v_1 \bmod 2$ (see equation (2)), and an even a gives T and $u_1$ even (equations (2) and (6) taking into account b odd).

If b is even we get a similar conclusion by interchanging a and b.

(iv) $F_2 = 2abc$. We should have $a(u - u')^2 + b(v - v')^2 = -2abc$ which is impossible as a, b, c are positive.


(v) $F_2 = -2abc$. We get in a similar manner as in (iii),
(11) $av_1^2 + bu_1^2 = 6c$,
(12) $uu_1 + vv_1 = 3c$,
the equations (2) and (3) with 3c instead of c. The number $2(au^2 + bv^2) - 3abc$ must be 3c times a square. We get
(13) $3t^2 - 2w^2 = -ab$ (t an integer).
Now let n be the norm with respect to the field $R(\sqrt6)$. We have
$2(w^2 - 3t^2/2) = ab$, $n\{(2 - \sqrt6)(w + t\sqrt6/2)\} = -ab$.
We consider t and w as positive. Since $t < \sqrt{ab}$ and $w < \sqrt{ab}$, we have $U = |t - w| < \sqrt{ab}$; with T = 2w - 3t we have the equation analogously as before
(14) $T^2 + ab = 6U^2$.
Combining (11) and (14) as before, we have with the abbreviations
(15) $v_1T - bu_1 = 6X$, $av_1 + u_1T = 6Y$,
the equation $aX^2 + bY^2 = cU^2$.

Now all depends on the fact that the left sides of the equations (15) are divisible by 6. The demonstration as above that 6X and 6Y are even, holds. With 6X = X', 6Y = Y' we have three cases:

(i) $a \equiv 0 \bmod 3$. Then $T \equiv 0 \bmod 3$, $bu_1 \equiv 0 \bmod 3$, $u_1 \equiv 0 \bmod 3$ as a and b are prime to each other, $X' \equiv Y' \equiv 0 \bmod 3$.

(ii) $b \equiv 0 \bmod 3$, $T \equiv 0 \bmod 3$, $X' \equiv 0 \bmod 3$, $v_1 \equiv 0 \bmod 3$, therefore $Y' \equiv 0 \bmod 3$.

(iii) If a, b both are not divisible by 3 we conclude: As (-ab) is a quadratic residue mod 3 the rational prime number 3 is the product of two different conjugate prime ideals j, j' in $R(\sqrt{-ab})$. It is only necessary to change perhaps the sign of T and $u_1$ so that we have
\[ v_1 + \sqrt{-ab}\,\frac{u_1}{a} \equiv 0 \bmod j, \]
\[ T + \sqrt{-ab} \equiv 0 \bmod j'. \]
Then $X' + \sqrt{-ab}\,\frac{Y'}{a} \equiv 0 \bmod 3$, $X' \equiv Y' \equiv 0 \bmod 3$.


Suppose $F_1 = abc$. We divide the set of triples (u,v,w) in categories.

(i) abc odd. We have four categories: (1) u, v, w odd; (2) u odd, v, w even; (3) and (4) analogously as (2) with v, w instead of u.

(ii) If abc is even we alter the letters for the moment as the sign of the coefficients is now of no importance. Let an equation $LX^2 + MY^2 + NZ^2 = 0$ be given with L even, therefore M, N odd. We have nine categories: u even, therefore v, w must be odd. For if v, w were even abc would result $\equiv 0 \bmod 4$. The categories are (1) $v \equiv w \equiv 1 \bmod 4$. (2) $v \equiv -w \equiv 1 \bmod 4$. (3) $-v \equiv w \equiv 1 \bmod 4$. (4) $v \equiv w \equiv -1 \bmod 4$. If u is odd we have the same four categories. A ninth will be v, w even.

Let us return to our original designation. In any case if two triples are of the same category we have $u \equiv u'$, $v \equiv v'$, $w \equiv w' \bmod 2$. In the case (ii) we have $F_2 \equiv F_1 \bmod 4$. In any case we have $F_2 = \pm abc$. With the integers
\[ U = \frac{u \mp u'}{2}, \text{ etc.} \]
we have $|W| < \sqrt{ab}$ and F(U,V,W) = 0.


III


We exclude the case ac = 1, b > 1, for the solution (1,0,1) satisfies our main theorem. In the same way bc = 1, a > 1 is excluded. Therefore bc, ca, ab are not squares. All depends on the question: Are there different triples of the same category? We put all the pairs of triples $(x_1,y_1,z_1)$, $(x_2,y_2,z_2)$ (Lemma 1) with $z_2 > z_1$ in the drawer $z_2$. We have $(1 + [\sqrt{ab}])$ drawers, but on account of Lemma 4 we can consider the drawer $z_2 = 0$ void. How many pairs of triples do we find in the drawer with the greatest number of pairs?

We first prove two further lemmas. 

LEMMA 6: We consider H elements A, ... by which a certain number of pairs of different elements is formed and arranged in classes according to the following rules:

(1) Each element occurs in at least one pair.

(2) The pairs in which any element occurs are all in the same class. We write class (A).

(3) If there are any pairs (A,A') and (A,A'') the pair (A',A'') exists and belongs to the class (A).

(4) The arrangement (A,A') or (A',A) in a pair is of no importance.

Then there are at least H/2 pairs.

Proof: At first we form all pairs of the class (A) where A is any element. The $g \ge 2$ elements which occur in pairs of the class (A) form g(g-1)/2 pairs, all existing and belonging to (A), therefore at least g/2 pairs.

If these are not all the pairs, then we have an element B not occurring in the pairs of (A). If g' elements occur in the pairs of (B) we have at least g'/2 pairs in the class (B).

Finally we see that, since
\[ H = g + g' + g'' + \ldots + g^{(n)}, \]
the number H' of the pairs satisfies
\[ H' \ge g/2 + g'/2 + \ldots + g^{(n)}/2 = H/2. \]

LEMMA 7: If S is any of the numbers bc, ca, ab and S + t is the first square surpassing the integer S, we have
$$\sqrt{S+t} > \sqrt{S} + \frac{2t}{5\sqrt{S}}.$$

Proof: We have $S \ge t$ ($S = t$ only in the case S = 2), therefore $25S > 20S + 4t$. Multiplying by $t/(25S)$ we get
$$t > \frac{4t}{5} + \frac{4t^2}{25S}, \qquad S + t > \left(\sqrt{S} + \frac{2t}{5\sqrt{S}}\right)^2,$$
hence the theorem.
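As an illustration only (it is not part of the original argument), the inequality of Lemma 7 can be checked in exact integer arithmetic. After squaring and multiplying by 25S it reads $25S(S+t) > (5S+2t)^2$. The function names `first_square_gap` and `lemma7_holds` are hypothetical, and the check assumes S is a non-square integer with S ≥ 2, as in the text.

```python
import math

def first_square_gap(S):
    """Return t such that S + t is the first square exceeding the integer S."""
    r = math.isqrt(S)
    return (r + 1) ** 2 - S

def lemma7_holds(S):
    """Check sqrt(S+t) > sqrt(S) + 2t/(5 sqrt(S)) without floating point.

    Both sides are positive, so the inequality is equivalent to
    25*S*(S+t) > (5*S + 2*t)**2.
    """
    t = first_square_gap(S)
    return 25 * S * (S + t) > (5 * S + 2 * t) ** 2

def is_square(S):
    return math.isqrt(S) ** 2 == S
```

The reduction to integers avoids any rounding question; the same computation also confirms the remark that S = t occurs only for S = 2.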
We have at least
$$H = ([\sqrt{bc}] + 1)([\sqrt{ca}] + 1)([\sqrt{ab}] + 1) - abc$$
triples $(x_1,y_1,z_1)$ which occur in any pair. According to Lemma 6, H/2 is a lower bound for the number of pairs of triples.
According to Lemma 5 we can suppose that in a drawer all pairs have different values $z_1$. For, the case w = w' need not be taken into consideration.
Now we use Lemma 7. If (bc + P), (ca + Q), (ab + R) are the first squares surpassing the numbers bc, ca, ab, we have the following important inequality:
$$H > \left(\sqrt{bc} + \frac{2P}{5\sqrt{bc}}\right)\left(\sqrt{ca} + \frac{2Q}{5\sqrt{ca}}\right)\left(\sqrt{ab} + \frac{2R}{5\sqrt{ab}}\right) - abc > \frac{2}{5}(Pa + Qb + Rc) > \frac{2c}{5}.$$





We have less than $\sqrt{ab}$ drawers; the drawer with the most pairs contains more than $c/(5\sqrt{ab})$ pairs. Supposing this number to exceed 5, so that $c > 25\sqrt{ab}$, we are sure to have at least 5 pairs in a drawer.

We have at least 5 different pairs of triples $(u^{(1)},v^{(1)},w^{(1)})$, $(u^{(2)},v^{(2)},w^{(2)})$, etc. All these pairs can be normed, e.g. by $w^{(i)} > 0$. These we can arrange into







pairs (u,v,w), (u',v',w'). As there are at most nine categories, at least one pair consists of two triples of the same category.
If $c > 25\sqrt{ab}$ we have solutions of (1) with $|z| < \sqrt{ab}$.


IV 


The disagreeable restriction $c > 25\sqrt{ab}$ can be removed easily. We require the following theorem from the theory of numbers:


THEOREM: In a field let j be any integer ideal, $\alpha$ a number prime to j. There are infinitely many prime ideals $(\pi)$ of the first degree with $\pi \equiv \alpha \bmod j$.

This theorem, the generalized prime number theorem, was found by E. 
Hecke and published in the Mathematische Zeitschrift, vol. 1 (1918), p. 375 
(special case) and vol. 6 (1920), p. 38. A simple proof founded on Takagi’s 
class field theory was given by H. Hasse in the Jahresbericht der Deutschen 
Mathematikervereinigung, vol. 35 (1926), p. 32. 

According to this theorem there are an infinity of principal prime ideals $(\pi)$ of the first degree, with $\pi \equiv 1 \bmod 8ab$, in the field $R(\sqrt{-ab})$, prime to c. The conjugated prime ideal $\pi'$ and the norm $n(\pi) = p$ (a rational prime number) satisfy the same congruence. If $p > 25\sqrt{ab}$, then $cp > 25\sqrt{ab}$. Suppose $-ab = 2^g t$, g = 0 or g = 1, t a negative integer. As p is a quadratic residue of t and $p \equiv 1 \bmod 8$, so also t, −1 and 2 are quadratic residues of p. The number −ab is a quadratic residue of p, therefore of cp. As the numbers bc and p are quadratic residues of a, so also is the number bcp. In the same manner we find cap to be a quadratic residue of b.

Therefore there will be solutions of
$$ax^2 + by^2 = cpz^2$$
with $|z| < \sqrt{ab}$. We have
$$a\, n\!\left(x + \frac{y}{a}\sqrt{-ab}\right) = cpz^2,$$
and as $ax^2 + by^2 \equiv 0 \bmod p$, we can get (by changing perhaps the sign of y) that the expression in the norm is $\equiv 0 \bmod \pi'$. We have
$$n(\pi) = n(r + s\sqrt{-ab}) = p, \qquad \pi = r + s\sqrt{-ab},$$
and multiplying,
$$a\, n\{(rx + bsy) + \sqrt{-ab}\,(-ry/a + sx)\} = p^2cz^2.$$

Both terms $rx + bsy$ and $-ry/a + sx$ are divisible by p. With the abbreviations $rx + bsy = pX$, $-ry + asx = pY$ we have
$$aX^2 + bY^2 = cz^2, \quad \text{with } |z| < \sqrt{ab}.$$
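The multiplication step rests on the norm identity $a(rx+bsy)^2 + b(asx-ry)^2 = (r^2+abs^2)(ax^2+by^2)$. The following sketch, added purely as an illustration, checks that identity on integers; the function names `transformed_pair` and `identity_holds` are hypothetical.

```python
def transformed_pair(a, b, r, s, x, y):
    """Return (U, V) = (r*x + b*s*y, a*s*x - r*y), i.e. (pX, pY) in the text."""
    return r * x + b * s * y, a * s * x - r * y

def identity_holds(a, b, r, s, x, y):
    """Check a*U^2 + b*V^2 == (r^2 + a*b*s^2) * (a*x^2 + b*y^2)."""
    U, V = transformed_pair(a, b, r, s, x, y)
    left = a * U * U + b * V * V
    right = (r * r + a * b * s * s) * (a * x * x + b * y * y)
    return left == right
```

With $r^2 + abs^2 = p$ and $ax^2 + by^2 = cpz^2$ the right-hand side becomes $p^2cz^2$, which is the relation used above.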
V 


REMARK 1: The bound for |z| is exceedingly narrow, and apparently cannot be surpassed easily by any general one which is lower. We have, e.g., the equation $157x^2 + 3y^2 = z^2$ with the minimal solution $(1, 9, 20 = [\sqrt{471}] - 1)$.
REMARK 2: If a has a quadratic factor, so that $a = a'r^2$ where a' is free of squares, we have a solution (x',y',z') of
$$a'x^2 + by^2 - cz^2 = 0 \qquad (a'bc > 1)$$
with $|x'| < \sqrt{bc}$, $|y'| < \sqrt{ca'}$, $|z'| < \sqrt{a'b}$. The solution (x', ry', rz') of (1) satisfies the inequalities of our main theorem.
We can proceed similarly if b and c have any quadratic factor.


REMARK 3: Writing $g(x,y) = ax^2 + 2bxy + cy^2$, we consider
(16) $g(x,y) = mz^2$.

We suppose a, c, m positive integers, b an integer, $d = b^2 - ac$ negative, d' (positive) the greatest factor of d free of squares, m prime to d. Further we assume a, b, c to be co-prime (i.e., to have the greatest common divisor 1). We can replace the classical binary form g(x,y) by any equivalent one. As g represents an infinity of prime numbers, and there are always equivalent forms whose first coefficient is any number represented by the form prime to the discriminant d, there is no restriction of generality in taking the first coefficient to be a prime number not dividing dm.
If (16) has solutions, so has the equation

(17) $u^2 - dv^2 = amz^2$,

which we prove very easily by multiplying (16) by a, and putting ax + by = u, y = v. But also every solution of (17) gives a solution of (16), since we have $(u/v)^2 \equiv d \bmod a$, $b^2 \equiv d \bmod a$; and since a is a prime number, we have, after changing the sign of v if necessary, $u/v \equiv b \bmod a$, u = ax + by, $g(x,v) = mz^2$.
Thus the equation (16) has solutions with $|z| < |\sqrt{d}| = \sqrt{-d}$ if d' > 1 and these solutions are non-trivial. Of course, we assume the other conditions mentioned above too. If d' = 1, there are solutions with $|z| = |\sqrt{d}|$.


VI


Mr. Aubry gave, in Sphinx-Oedipe, vol. 8 (1913), p. 150, the following 
bounds: The equation pX? = Y? + rZ* has for r > 0 solutions with |X| < 
V21r/3. If r is negative he gives the bounds |X| < Vr, |Z| < Vp, |Y| 
< ~/-—2rp. Our bound for Y is better, namely |Y| < /—rp. 





Graz, Austria 













HELLY’S THEOREMS ON CONVEX DOMAINS AND 
TCHEBYCHEFF’S APPROXIMATION PROBLEM 


HANS RADEMACHER AND I. J. SCHOENBERG


1. Introduction. Professor Dresden called to our attention the following theorem:¹

If $S_1, S_2, \dots, S_m$ are m line segments parallel to the y-axis, all of equal lengths, whose projections on the x-axis are equally spaced, and if we assume that a straight line can be made to intersect every set of three among these segments, then there exists a straight line intersecting all the segments.

This theorem was conjectured by M. Dresher; a first proof, unpublished, 
was communicated to us by T. E. Harris. Wide generalizations are possible. 
Dr. Harris noticed that we can dispense with the equidistance of the lines 
carrying our segments. We shall see in a moment that the equality of the 
lengths of the segments is likewise a superfluous assumption. A further gener- 
alization, also due to Dr. Harris, is as follows: The intersecting straight lines 
can be replaced by general parabolic curves
(1) $y = a_0x^n + a_1x^{n-1} + \dots + a_n$, $(n \le m - 2)$;
again, if each set of n + 2 among our segments can be cut by such a parabola, then all may be simultaneously intersected by one such curve.

In this note we wish to point out the close connection of this problem, and 
of the more general problem of best approximation in the sense of Tchebycheff, 
with two remarkable theorems on convex domains, due to E. Helly, which 
may be stated as follows:²


THEOREM 1 (Helly). If $C_1, C_2, \dots, C_m$ is a finite collection of convex sets, which need not be closed or bounded, in the n-dimensional Euclidean space $E_n$ $(m \ge n + 1)$, such that every n + 1 among the sets have a common point, then all m sets have a common point.

THEOREM 2 (Helly). Let {D} be an infinite collection of closed and convex sets D, which need not be bounded, in $E_n$, such that every n + 1 among the sets have a common point. Then all the sets D have a common point, provided there exists a finite subcollection $D', D'', \dots, D^{(k)}$, $(k \ge 1)$, of elements of {D}, such that their intersection $A = D'D'' \cdots D^{(k)}$ is non-void and bounded.
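To fix ideas, here is a small sketch of Theorem 1 in the simplest case n = 1, where the convex sets are taken to be closed intervals of the line and "every n + 1" means "every two". It is an illustration under that assumption only, and the names `common_point` and `every_pair_meets` are hypothetical.

```python
from itertools import combinations

def common_point(intervals):
    """Return a point lying in every closed interval (lo, hi), or None."""
    left = max(lo for lo, hi in intervals)
    right = min(hi for lo, hi in intervals)
    return left if left <= right else None

def every_pair_meets(intervals):
    """Helly's hypothesis for n = 1: every two intervals have a common point."""
    return all(common_point(pair) is not None
               for pair in combinations(intervals, 2))
```

For intervals the two conditions coincide: the largest left endpoint and the smallest right endpoint already belong to some pair.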

Let us first see how very directly the Dresher-Harris theorem may be derived from Helly's Theorem 1. Let
$$S_\nu:\ x = x_\nu,\quad b_\nu \le y \le c_\nu, \qquad (\nu = 1, \dots, m;\ x_1 < x_2 < \dots < x_m,\ m \ge 3),$$

Received March 22, 1949. 


¹See [8], p. 4, where the theorem is stated without proof.
²See [4], [6], and [7].




be the m segments of the theorem. Consider the totality of lines $y = a_0x + a_1$ intersecting the νth segment $S_\nu$; the requirement of intersection is expressed by the inequalities

(2) $b_\nu \le a_0x_\nu + a_1 \le c_\nu$.

In the plane $E_2$ of the variables $(a_0, a_1)$, the double inequality (2) defines a parallel strip of slope $-x_\nu$. This strip, which we denote by $C_\nu$, is certainly a convex set in $E_2$. Let us now consider the collection of m convex sets $C_1, C_2, \dots, C_m$, corresponding to the segments $S_1, S_2, \dots, S_m$. From the assumption of the Dresher-Harris theorem we know that every three among these sets have a common point. Since all assumptions of Helly's Theorem 1 are satisfied in $E_2$, we may conclude that all m sets $C_\nu$ have a common point, hence all m segments $S_\nu$ are intersected by a line. This proof clearly extends to the case of the parabolic curves (1) by applying Helly's Theorem 1 in the space $E_{n+1}$ of the variables $(a_0, a_1, \dots, a_n)$.
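The strip argument can be tried out numerically. The sketch below is an illustration and not the authors' method: it looks for a point of the intersection of the strips (2) by testing the lines through two endpoints of two different segments. This suffices because a non-void bounded intersection of strips has a vertex lying on two boundary lines. The names `stabbing_line` and `every_triple_stabbed` are hypothetical; segments are given as triples (x, b, c) with pairwise distinct x.

```python
from fractions import Fraction
from itertools import combinations

def stabbing_line(segments):
    """Return (a0, a1) with b <= a0*x + a1 <= c for every segment, or None.

    Assumes at least two segments (x, b, c) with pairwise distinct x.
    """
    segs = [(Fraction(x), Fraction(b), Fraction(c)) for x, b, c in segments]
    for (x1, b1, c1), (x2, b2, c2) in combinations(segs, 2):
        for y1 in (b1, c1):          # endpoint chosen on the first segment
            for y2 in (b2, c2):      # endpoint chosen on the second segment
                a0 = (y2 - y1) / (x2 - x1)
                a1 = y1 - a0 * x1
                if all(b <= a0 * x + a1 <= c for x, b, c in segs):
                    return a0, a1
    return None

def every_triple_stabbed(segments):
    """Hypothesis of the Dresher-Harris theorem: each triple admits a line."""
    return all(stabbing_line(t) is not None
               for t in combinations(segments, 3))
```

Exact rational arithmetic is used so that a line touching an endpoint is not rejected by rounding.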

As a second example of the versatility of Helly's ideas we shall again use Theorem 1 to give a new proof of the following separation theorem.³

THEOREM OF PAUL KIRCHBERGER. Let S = {P} and S' = {P'} be two finite sets of points in $E_n$. We shall say that a hyperplane $\pi$ separates strictly S from S', if all points of S are on one side of $\pi$, while all points of S' are on the other side, with none of the points lying on $\pi$. Such a strictly separating plane $\pi$ exists if and only if the following condition is satisfied: For every set T of n + 2 points chosen arbitrarily from S and S', there should exist a hyperplane $\pi_T$ which separates strictly the S-points of T from the S'-points of T.

The necessity of the condition is obvious; to prove its sufficiency let us 
assume that it is satisfied and prove the existence of a strictly separating plane. 


We introduce in $E_n$ a coordinate system $(x_1, \dots, x_n)$. In the space $E_{n+1}$ of the variables $(a_1, a_2, \dots, a_{n+1})$, and corresponding to each point $P = (x_1, \dots, x_n)$ of S, we define an open half-space $H_P$ by the inequality

(3) $H_P$: $a_1x_1 + a_2x_2 + \dots + a_nx_n + a_{n+1} > 0$.

Likewise, corresponding to each point $P' = (x'_1, x'_2, \dots, x'_n)$ of S', we define in $E_{n+1}$ an open half-space $H_{P'}$ by the inequality

(4) $H_{P'}$: $a_1x'_1 + a_2x'_2 + \dots + a_nx'_n + a_{n+1} < 0$.


³See [5], where a proof of this theorem requires nearly 24 pages. The theorems of Kirchberger and Dresher-Harris are not unrelated. The following new generalization of the Dresher-Harris theorem indicates the connection: Let S be a finite set of points $P_i = (x_i, y_i)$ in the plane and let S' be a second set of points $P'_j = (x'_j, y'_j)$. We say that a line $y = a_0x + a_1$ separates the sets S and S', if $y_i \ge a_0x_i + a_1$ for all points of S, and $y'_j \le a_0x'_j + a_1$ for all points of S'. There exists a line $y = a_0x + a_1$ separating S from S' if and only if the following condition is satisfied: For every set T of three points chosen from S + S' there should exist a line separating the S-points of T from the S'-points of T. We obtain the Dresher-Harris theorem as a special case of this theorem if we take S to be the set of upper endpoints of the segments $S_\nu$, while S' is the set of their lower endpoints. A proof of this generalization by means of Theorem 1 is obvious and so is its extension to parabolic curves (1).







In terms of the finite collection $\{H_P\} + \{H_{P'}\}$ of open half-spaces of $E_{n+1}$, Kirchberger's condition means that every n + 2 among these convex half-spaces have a common point. By Helly's Theorem 1 we conclude that there is a point $(a_1, \dots, a_{n+1})$, with $\sum_1^n |a_\nu| > 0$, common to all of these half-spaces.⁴ The corresponding plane $a_1x_1 + \dots + a_nx_n + a_{n+1} = 0$ separates strictly S from S', and the proof is concluded. That Kirchberger's theorem becomes false if the number n + 2 is replaced by n + 1 is seen by the example of the set S being the set of n + 1 vertices of a simplex, while S' has only one element, namely the centroid of the simplex. These two sets S, S' cannot be separated, though the points of S and S' occurring in any (n + 1)-tuple can be separated. We also wish to remark that the theorem becomes false if the sets S, S' are allowed to be infinite. Indeed, if in $E_2$ we take S to be the exponential curve y = exp x while S' is the x-axis, then clearly every n + 2 = 4 points of S + S' can be strictly separated by a line, but not the sets S, S'.

Concerning Kirchberger's theorem the following remark is of interest. Let us replace the "strict separation" of the theorem by "separation" in the weaker sense that points of S or S' are also allowed to lie on the separating plane $\pi$. We may then state the following proposition:

Kirchberger's theorem in $E_n$ remains true if in its statement "strict separation" is replaced by "separation" in the above wider sense, provided we replace in the theorem's condition the number n + 2 by 2n + 2. Also no number smaller than 2n + 2 will do. Moreover the sets S and S' may now also be infinite.

In order to prove this new result let us define in $E_{n+1}$, as we did above, the collection of closed half-spaces

(5) $H_P$: $a_1x_1 + \dots + a_nx_n + a_{n+1} \ge 0$, for $P = (x_1, \dots, x_n) \in S$,

(6) $H_{P'}$: $a_1x'_1 + \dots + a_nx'_n + a_{n+1} \le 0$, for $P' = (x'_1, \dots, x'_n) \in S'$.

We wish to prove the existence of a point $(a_1, \dots, a_{n+1})$, with $\sum_1^n |a_\nu| > 0$, which is common to all of these half-spaces. Helly's theorem does not help us here any more, but we can apply the following remarkable theorem of L. L. Dines and N. H. McCoy:⁵

A finite or infinite collection of closed half-spaces in $E_{n+1}$, each half-space having the origin O on its boundary, do have a common point different from O, if every set of 2n + 2 among our half-spaces have a common point different from O.


This theorem assures the existence of a point $(a_1, \dots, a_{n+1}) \ne (0, \dots, 0)$ such that the inequalities (5) and (6) hold for all $P \in S$, and all $P' \in S'$,


⁴Actually we know only that $\sum_1^{n+1} |a_\nu| > 0$. However, all the points of a sufficiently small spherical neighbourhood of the point $(a_1, \dots, a_{n+1})$ likewise satisfy all conditions and among them we can certainly find one for which $\sum_1^n |a_\nu| > 0$.

⁵See [3], pp. 61-63; see also [2], pp. 962-963, where there are also references to a paper by C. V. Robinson.













respectively. This would mean that $a_1x_1 + \dots + a_nx_n + a_{n+1} = 0$ is a separating hyperplane as soon as we know that $\sum_1^n |a_\nu| > 0$. This last point, however, is clear, for $a_1 = \dots = a_n = 0$ and (5) and (6) would imply $a_{n+1} \ge 0$, $a_{n+1} \le 0$, hence $a_{n+1} = 0$, which is impossible.

The following example shows that the number 2n + 2 of the new version of Kirchberger's theorem may not be replaced by 2n + 1: Let S consist of the n + 1 vertices $P_1, \dots, P_{n+1}$ of a simplex $\sigma$, and let S' consist of the same n + 1 points $P'_1, \dots, P'_{n+1}$, with $P'_\nu = P_\nu$. Choosing 2n + 1 points of S + S' amounts to leaving out $P_\nu$, or perhaps $P'_\nu$. The remaining 2n + 1 points are clearly separated by the (n − 1)-dimensional face of the simplex $\sigma$ which is opposite to the vertex $P_\nu$. Hence the conditions of the theorem are verified for every set of 2n + 1 points, while there is no hyperplane $\pi$ separating S from S'.

The connection of Helly's theorems with the idea of Tchebycheff approximation, i.e. the consideration of the minimum of a maximum, suggested to us a new proof⁶ which we claim to be the first proof of Helly's theorems to be entirely geometric, in the sense that every single one of its steps has an intuitive geometric meaning. This proof is given in the first part of the paper. The second and last part is devoted to an application of Helly's theorems to Tchebycheff's approximation problem.


A NEW PROOF OF HELLY'S THEOREMS


2. On proximity points of convex domains. We shall see that the main point in proving Helly's theorems is to prove Theorem 1 for the special case when the convex sets $C_1, \dots, C_m$ are also closed and bounded. A closed and bounded convex set in $E_n$ will be referred to as a convex domain. Let $D_1, D_2, \dots, D_m$ be such convex domains. If $Q \in E_n$, we denote by $d(Q, D_\nu)$ the distance from the point Q to the domain $D_\nu$. The point function
$$f(Q) = \max_\nu d(Q, D_\nu)$$
is evidently non-negative and continuous throughout $E_n$. Since $f(Q) \to \infty$ as $Q \to \infty$, the function f(Q) assumes somewhere its absolute minimum value.

DEFINITION. An absolute minimum point P, of f(Q), will be called proximity point of our domains $D_1, \dots, D_m$. It has the property
$$\max_\nu d(P, D_\nu) = \min_Q \max_\nu d(Q, D_\nu).$$


⁶Three earlier proofs have come to our attention: By J. Radon [7], E. Helly [4], and D. König [6]. Radon's proof, which is the shortest, is analytic. The proofs by Helly and König, essentially equivalent to each other, are geometric. However, all three proofs use the method of mathematical induction, a fact which seems to obscure the intuitive background of the results. Our proof uses the metric of $E_n$ and is therefore related to the ideas of Menger and Blumenthal (see [2]).




The minimal value f(P) = min f(Q) will be called the proximity of these domains and denoted by the symbol Prox $(D_1, \dots, D_m)$.

Evidently we have Prox $(D_1, \dots, D_m) = 0$ if and only if our m domains have a point in common and if this is the case, any point of their intersection is a proximity point.⁷
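As a numerical illustration of the definition, consider the special case in which each domain is a single point of the plane, so that d(Q, D) is the ordinary distance. For the vertices of an acute-angled triangle the proximity point is then the circumcenter. The sketch below checks this on a grid around the circumcenter; the names `f` and `circumcenter` and the sample triangle are hypothetical choices, not taken from the paper.

```python
import math

def f(q, points):
    """f(Q) = max distance from Q to the one-point domains."""
    return max(math.dist(q, p) for p in points)

def circumcenter(a, b, c):
    """Circumcenter of the non-degenerate triangle a, b, c in the plane."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax**2 + ay**2, bx**2 + by**2, cx**2 + cy**2
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return ux, uy

# An acute-angled triangle; its circumcenter is (2, 1), at distance sqrt(5)
# from each vertex.
TRIANGLE = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
```

For an obtuse triangle the minimum of f moves to the midpoint of the longest side, so the acute-angle assumption matters.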


3. A characteristic property of proximity points.⁸ The following theorem expresses a fundamental property of proximity points.

THEOREM 3. Let $D_1, \dots, D_m$, $(m \ge 2)$, be convex domains in $E_n$, having no common point. Let P be a proximity point of these domains, the proximity $p = \text{Prox}(D_1, \dots, D_m)$ being necessarily positive. Let $P_\nu \in D_\nu$ be such that $PP_\nu = d(P, D_\nu)$, hence we have $p = \max PP_\nu$. Then there are s among the m normals $PP_\nu$ from P to our domains, $PP_1, PP_2, \dots, PP_s$ say, such that
(i) $2 \le s \le n + 1$.
(ii) $PP_1 = PP_2 = \dots = PP_s = p$.
(iii) The points $P_1, P_2, \dots, P_s$ are the vertices of a (s − 1)-simplex $\sigma$, which simplex contains the point P in its (s − 1)-dimensional interior.
(iv) The corresponding s domains $D_1, D_2, \dots, D_s$ have no common point.


The last conclusion is for us the important one. In fact we shall use this theorem only in the following abbreviated form:
COROLLARY. If m convex domains, of $E_n$, have no common point, then some s among these domains have no common point, where $2 \le s \le n + 1$.
Proof of Theorem 3. Suppose that
$$PP_1 = PP_2 = \dots = PP_h = p, \quad PP_{h+1} < p, \dots, PP_m < p \qquad (h \le m).$$
Clearly $h \ge 2$; for if h = 1, then P could not be a proximity point. Indeed, then $\max d(P, D_\nu)$ could be diminished below its present value p by moving P slightly along $PP_1$ towards $P_1$.
Consider now the convex hull of the points $P_1, \dots, P_h$, which we denote by
$$K = K(P_1, \dots, P_h).$$
We claim that $P \in K$, for otherwise let PP' be the shortest distance from P to K; we could then, again as before, diminish all distances $PP_\nu = d(P, D_\nu)$, $(\nu = 1, \dots, h)$, by moving P slightly along PP' towards P'. Hence indeed $P \in K(P_1, \dots, P_h)$.

We shall now use the following known result:⁹ If P is a point of the convex








⁷As illustrations of the notion of a proximity point we mention the following two propositions of elementary geometry: Let A, B, C, be the vertices of an acute-angled triangle in the plane. The proximity point of the three points A, B, C, is the circumcenter of the triangle. The proximity point of the three segments BC, CA, AB, is the incenter.

⁸The properties (i), (ii), and (iii) of the point P, as described in Theorem 3 are indeed characteristic for a proximity point, a fact which we mention without proof because we do not use it.

⁹See [1], Satz IX on p. 607. This theorem is easily derived from a well known result of Carathéodory to the effect that every point of $K(P_1, \dots, P_h)$ is a centroid with positive masses of at most n + 1 points among the $P_\nu$.













hull $K(P_1, \dots, P_h)$, then either P coincides with one of the points $P_\nu$, or else we can find a simplex $\sigma$, of dimension s − 1 ranging from 1 to at most n, having as vertices only points from among the points $P_\nu$, and such that P is in the (s − 1)-dimensional interior of $\sigma$. Returning to our proximity point P, we remark that P cannot possibly coincide with any of the points $P_\nu$, since $PP_\nu = p > 0$, $(\nu = 1, \dots, h)$. Therefore the above result assures us of the existence of a simplex of vertices $P_1, P_2, \dots, P_s$, say, satisfying the conditions (i), (ii), and (iii), of Theorem 3.

There remains to prove the fourth and last statement of the theorem to the effect that $D_1D_2 \cdots D_s = \emptyset$. We consider the s unit vectors
$$a_i = \overrightarrow{PP_i}/PP_i \qquad (i = 1, \dots, s),$$
and the s half-spaces $H_i$ defined by

(7) $H_i$: $\overrightarrow{PQ} \cdot a_i \ge p$ $(i = 1, \dots, s)$.

Since $D_i \subset H_i$, it is sufficient to show that

(8) $H_1H_2 \cdots H_s = \emptyset$.

Suppose (8) were false and let $Q \in H_i$, $(i = 1, \dots, s)$; then all inequalities (7) hold. However, since P is in the interior of $\sigma$, we have a vector relation of the form

(9) $\sum_1^s \kappa_i a_i = 0$, with all $\kappa_i > 0$.

But then, on multiplying (9) scalarly by $\overrightarrow{PQ}$, in view of (7), we obtain
$$0 = \sum \kappa_i (\overrightarrow{PQ} \cdot a_i) \ge \sum \kappa_i p = p \sum \kappa_i,$$
which clearly contradicts the positivity of p and $\kappa_i$. This completes our proof.


4. A proof of Helly's Theorem 1 concerning a finite collection of convex sets. We distinguish two cases.

First case: We assume that the m convex sets $C_\nu$ of the theorem are also closed and bounded, an assumption which we emphasize by writing $C_\nu = D_\nu$, $(\nu = 1, \dots, m)$. This case is now immediately disposed of, for if we assume to the contrary that our m convex domains $D_\nu$ have no common point, then, by the Corollary of Theorem 3, some s among them $(2 \le s \le n + 1)$, have no point in common, a fact which contradicts the assumption of Theorem 1 to the effect that every n + 1 domains have a common point.

Second case. We assume that the $C_\nu$ are convex sets which need not be closed or bounded. By assumption every combination $C_{i_0}, C_{i_1}, \dots, C_{i_n}$ of n + 1 distinct sets have a common point. Let such a point be $A_{i_0, i_1, \dots, i_n}$ and let it be regarded as a symmetric function of its n + 1 distinct subscripts. Corresponding to each $C_i$ we now define the convex domain
$$D_i = K(A_{i, j_1, \dots, j_n})$$




which is defined as the convex hull of the $\binom{m-1}{n}$ points $A_{i, j_1, \dots, j_n}$, where $j_1, \dots, j_n$ runs over all combinations of n among the m − 1 numbers $1, \dots, i-1, i+1, \dots, m$. Since $A_{i, j_1, \dots, j_n} \in C_i$, we have

(10) $D_i \subset C_i$,

because $C_i$ is convex. Every set of n + 1 among these domains, $D_{i_0}, D_{i_1}, \dots, D_{i_n}$, say, have a point in common, namely the point $A_{i_0, i_1, \dots, i_n}$. By the first case already established we conclude that all $D_i$ have a common point. In view of (10) we now obtain the desired conclusion to the effect that the sets $C_i$ have a point in common.


5. A proof of Helly's Theorem 2 concerning an infinite collection of closed convex sets. Let {D} be the given infinite collection of closed convex sets. By Theorem 1 we know that the elements of every finite subcollection of {D} have a common point. Consider the new collection {D* = AD}, where A is the non-void and bounded set defined in the statement of Theorem 2, while D ranges over the given collection {D}. The elements of {D*} have the following properties:

(i) They are closed, bounded and non-void convex sets.

(ii) The elements of every finite set of D*'s have a common point.

The desired conclusion to the effect that all the D*'s, and therefore also all the D's, have a common point now follows from the following general theorem:¹⁰


THEOREM OF F. RIESZ: If a collection {A} of bounded and closed sets in $E_n$ has the property that the elements of every finite subcollection have a common point, then all A's have a common point.


AN APPLICATION OF HELLY'S THEOREM TO TCHEBYCHEFF'S APPROXIMATION PROBLEM¹¹


6. Approximations to discontinuous functions. We first derive somewhat differently a classical result concerning the following finite problem: Let there be given n + 1 points

(11) $(x_\nu, y_\nu)$ $(\nu = 0, 1, \dots, n;\ x_0 < x_1 < \dots < x_n)$;

we wish to determine the polynomial

(12) $P(x) = a_0x^{n-1} + a_1x^{n-2} + \dots + a_{n-1}$

which minimizes the expression

(13) $\max_\nu |y_\nu - P(x_\nu)|$.

We need the following lemma: If the real variables $(u_0, u_1, \dots, u_n)$ are connected by the linear relation with real constant coefficients


¹⁰For the first published proof of Riesz's theorem see [6], p. 210; it is an almost immediate consequence of the Heine-Borel theorem.
¹¹An excellent reference to Tchebycheff's approximation problem is [9], Chapter VI.













(14) $b_0u_0 + b_1u_1 + \dots + b_nu_n = c$ $(b_0b_1 \cdots b_n \ne 0)$,

then the expression $\max |u_\nu|$ has the minimal value

(15) $p = \dfrac{|c|}{|b_0| + |b_1| + \dots + |b_n|}$,

which is reached for just one set of values $u_\nu = u^*_\nu$ given by

(16) $u^*_\nu = p\,\text{sgn}(cb_\nu)$ $(\nu = 0, \dots, n)$.

We lose no generality in assuming that c > 0, for if c = 0 the result is trivial and if c < 0 we may multiply both sides of (14) by −1. In view of (15), the relations (16) indeed define a solution of (14); (16) also imply that $|u^*_\nu| = p$, hence $p = \max |u^*_\nu|$. Let now $(u_\nu)$ be an arbitrary set satisfying the two relations
$$\sum b_\nu u_\nu = c \quad \text{and} \quad \max |u_\nu| \le p.$$
This set $(u_\nu)$ must be of the form
$$u_\nu = \epsilon_\nu u^*_\nu = \epsilon_\nu\, p\,\text{sgn}(b_\nu), \quad \text{where } -1 \le \epsilon_\nu \le 1, \quad (\nu = 0, \dots, n).$$
Now $c = \sum b_\nu u_\nu = \sum b_\nu \epsilon_\nu\, p\,\text{sgn}(b_\nu) = \sum \epsilon_\nu\, p\,|b_\nu|$, or $\sum \epsilon_\nu\, p\,|b_\nu| = c$. In view of (15) or $\sum p\,|b_\nu| = c$, the last relation implies that $\epsilon_\nu = +1$ for all ν, and therefore $u_\nu = u^*_\nu$. This completes the proof of our lemma.
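The lemma is easy to try on numbers. The following sketch, given only as an illustration, evaluates (15) and (16) in exact rational arithmetic; the names `sgn` and `minimax_solution` are hypothetical, and the coefficients are assumed to be non-zero integers or fractions as required in (14).

```python
from fractions import Fraction

def sgn(t):
    """Sign of a real number: -1, 0 or 1."""
    return (t > 0) - (t < 0)

def minimax_solution(b, c):
    """Return (p, u*) from (15) and (16) for the relation sum b[v]*u[v] = c.

    Assumes every b[v] is non-zero.
    """
    p = Fraction(abs(c)) / sum(abs(Fraction(v)) for v in b)
    u_star = [p * sgn(c * v) for v in b]
    return p, u_star
```

For b = (2, −3, 1) and c = 12 this gives p = 2 and u* = (2, −2, 2), which indeed satisfies 2·2 + (−3)(−2) + 1·2 = 12.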

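Returning to the problem of minimizing (13), let $u_\nu = y_\nu - P(x_\nu)$ be the discrepancies between the points (11) and the polynomial (12). These discrepancies are not independent variables, for they are obviously connected by the single linear relation
$$\begin{vmatrix} u_0 - y_0 & 1 & x_0 & \dots & x_0^{n-1} \\ u_1 - y_1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ u_n - y_n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix} = 0$$
or
$$\begin{vmatrix} u_0 & 1 & x_0 & \dots & x_0^{n-1} \\ u_1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ u_n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix} = \begin{vmatrix} y_0 & 1 & x_0 & \dots & x_0^{n-1} \\ y_1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ y_n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix}.$$

Since the coefficients of $(u_\nu)$ on the left-hand side of this linear relation alternate in sign, we obtain by (15) for the minimal value of (13) the explicit expression

(17) $p = \text{abs. val. } \begin{vmatrix} y_0 & 1 & x_0 & \dots & x_0^{n-1} \\ y_1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ y_n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix} \div \begin{vmatrix} 1 & 1 & x_0 & \dots & x_0^{n-1} \\ -1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ (-1)^n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix}.$

By (16) we also know that this minimal value p is reached for just one polynomial P(x) for which the discrepancies $u_\nu = y_\nu - P(x_\nu)$ are all equal in absolute value to p and alternate in sign.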
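The alternation property gives a direct way to compute the best polynomial for n + 1 points: one solves the linear system $P(x_\nu) + (-1)^\nu\lambda = y_\nu$ for the coefficients and for $\lambda$, and then $p = |\lambda|$. The sketch below does this with exact fractions instead of the determinants (17); it is an illustration under that reformulation, and the names `solve` and `best_polynomial` are hypothetical.

```python
from fractions import Fraction

def solve(A, rhs):
    """Solve the square non-singular system A z = rhs by Gauss-Jordan elimination."""
    n = len(A)
    M = [[Fraction(v) for v in row] + [Fraction(r)] for row, r in zip(A, rhs)]
    for col in range(n):
        piv = next(r for r in range(col, n) if M[r][col] != 0)
        M[col], M[piv] = M[piv], M[col]
        M[col] = [v / M[col][col] for v in M[col]]
        for r in range(n):
            if r != col and M[r][col] != 0:
                factor = M[r][col]
                M[r] = [vr - factor * vc for vr, vc in zip(M[r], M[col])]
    return [row[-1] for row in M]

def best_polynomial(points):
    """points: n+1 pairs (x, y) with increasing x.

    Returns ([a_0, ..., a_{n-1}], p) for P(x) = a_0 x^(n-1) + ... + a_{n-1},
    where the discrepancies y - P(x) equal (-1)^v * lambda and p = |lambda|.
    """
    n = len(points) - 1
    A = [[Fraction(x) ** (n - 1 - k) for k in range(n)] + [(-1) ** i]
         for i, (x, _) in enumerate(points)]
    *coeffs, lam = solve(A, [y for _, y in points])
    return coeffs, abs(lam)
```

For the three points (0, 0), (1, 1), (2, 0) and n = 2 the result is the horizontal line P(x) = 1/2 with p = 1/2, the discrepancies being −1/2, 1/2, −1/2.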



















Concerning the analytical problem of best approximation of functions we wish to prove the following

THEOREM 4. Let f(x) be a real function defined in $\alpha \le x \le \beta$ about which we only assume that it is bounded. Given n $(n \ge 1)$, there exists a real polynomial P*(x), of degree not exceeding n − 1, which minimizes the expression

(18) $\sup_{\alpha \le x \le \beta} |f(x) - P(x)|$,

giving it its minimal value
$$p = \sup_{\alpha \le x \le \beta} |f(x) - P^*(x)|.$$
For this minimal value p we have the relation

(19) $p = \sup_{(x_\nu)} p(x_0, x_1, \dots, x_n)$ for $\alpha \le x_0 < x_1 < \dots < x_n \le \beta$,

where

(20) $p(x_0, x_1, \dots, x_n) = \text{abs. val. } \begin{vmatrix} f(x_0) & 1 & x_0 & \dots & x_0^{n-1} \\ f(x_1) & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ f(x_n) & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix} \div \begin{vmatrix} 1 & 1 & x_0 & \dots & x_0^{n-1} \\ -1 & 1 & x_1 & \dots & x_1^{n-1} \\ \vdots & \vdots & \vdots & & \vdots \\ (-1)^n & 1 & x_n & \dots & x_n^{n-1} \end{vmatrix}.$

In words, (19) means that the best approximation p of our function is the supremum of its best approximation $p(x_0, \dots, x_n)$ over sets of n + 1 distinct points of the range $[\alpha, \beta]$.
Proof. Since f(x) is bounded, so are the best approximations (20). Let

(21) $p_0 = \sup_{(x_\nu)} p(x_0, x_1, \dots, x_n)$ for $\alpha \le x_0 < x_1 < \dots < x_n \le \beta$.

In the space $E_n$ of the variables $(a_0, a_1, \dots, a_{n-1})$ and corresponding to each value of x in the range $[\alpha, \beta]$, we consider the parallel layer of space $D_x$ defined by

(22) $D_x$: $|f(x) - a_0x^{n-1} - a_1x^{n-2} - \dots - a_{n-1}| \le p_0$.

We claim that the collection $\{D_x\}$ of convex domains in $E_n$ satisfies both assumptions of Helly's Theorem 2. Indeed, if $\alpha \le \xi_1 < \xi_2 < \dots < \xi_n \le \beta$, then $A = D_{\xi_1}D_{\xi_2} \cdots D_{\xi_n}$ is evidently non-void and bounded; in fact A is a proper parallelepiped, except in the case $p_0 = 0$ when A reduces to a point. Let us now consider n + 1 distinct abscissae

(23) $\alpha \le x_0 < x_1 < \dots < x_n \le \beta$.

If P(x) is the polynomial of best approximation to the points $(x_\nu, f(x_\nu))$ $(\nu = 0, \dots, n)$, we have by (21)
$$|f(x_\nu) - a_0x_\nu^{n-1} - \dots - a_{n-1}| = p(x_0, x_1, \dots, x_n) \le p_0 \qquad (\nu = 0, \dots, n).$$
Geometrically this means that the n + 1 convex domains $D_{x_0}, D_{x_1}, \dots, D_{x_n}$ have the common point $(a_0, \dots, a_{n-1})$. By Helly's theorem we conclude the existence of a point $(a^*_0, \dots, a^*_{n-1})$ which is common to all the domains $D_x$, hence there exists a polynomial P*(x) satisfying the inequality (22) for all x. For this polynomial P*(x) we therefore have

(24) $\sup_{\alpha \le x \le \beta} |f(x) - P^*(x)| \le p_0$.













On the other hand, for an arbitrary polynomial P(x) we have
$$\sup_{\alpha \le x \le \beta} |f(x) - P(x)| \ge \sup_\nu |f(x_\nu) - P(x_\nu)| \ge p(x_0, \dots, x_n),$$
or
$$\sup_{\alpha \le x \le \beta} |f(x) - P(x)| \ge p(x_0, \dots, x_n).$$

Taking the supremum of the right-hand side we find

(25) $\sup_{\alpha \le x \le \beta} |f(x) - P(x)| \ge p_0$, for every P(x),

in particular also

(26) $\sup_{\alpha \le x \le \beta} |f(x) - P^*(x)| \ge p_0$.

Now by (24) and (26) we find that

(27) $\sup_{\alpha \le x \le \beta} |f(x) - P^*(x)| = p_0$.

From (25) and (27) we see $p_0$ is the minimum value of (18), hence $p = p_0$, which is what we wanted to prove.

REMARKS. 1. The polynomial P*(x), whose existence has just been proved, need not be unique. Thus, if n = 2 and

(28) f(x) = [x] $(0 \le x \le 1)$,

then a graph will show that every polynomial of the family
$$P(x) = a_0(x - 1) + \tfrac{1}{2} \qquad (0 \le a_0 \le 1),$$
minimizes the expression (18), giving it its minimal value $p = \tfrac{1}{2}$.
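The non-uniqueness in example (28) can be seen numerically. The sketch below, offered as an illustration only, evaluates $|[x] - P(x)|$ on a grid of rational points of [0, 1] that includes both endpoints; the name `sup_error` and the grid size are hypothetical choices, and a grid maximum is of course only a lower bound for the true supremum.

```python
from fractions import Fraction
from math import floor

def sup_error(a0, steps=200):
    """Maximum of |[x] - (a0*(x - 1) + 1/2)| over the grid k/steps, 0 <= k <= steps."""
    xs = [Fraction(k, steps) for k in range(steps + 1)]
    return max(abs(floor(x) - (a0 * (x - 1) + Fraction(1, 2))) for x in xs)
```

On this grid every $a_0$ between 0 and 1 gives the value 1/2, attained at x = 1 where [x] jumps, while values of $a_0$ outside that range give a larger error at x = 0.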

2. The existence part of Theorem 4 is also easily established directly by familiar continuity arguments; however, a proof of the relation (19), which seems to be new at least for a discontinuous f(x), would be difficult or at least involved without the use of Helly's theorem, which bridges most naturally the gap between the finite (algebraic) best approximation problem for n + 1 points and the analytical problem for an interval $[\alpha, \beta]$.

3. Theorem 4 immediately generalizes to the case when the interval $[\alpha, \beta]$ of definition of f(x) is replaced by an arbitrary bounded point-set of the x-axis. A further possible extension of Theorem 4 is as follows: The inequality (22) means that the curve y = P(x) intersects the family of vertical segments $f(x) - p_0 \le y \le f(x) + p_0$, which are all of the same length $2p_0$. As in the Dresher-Harris theorem, this length could be required to vary with x.


7. The classical case of continuous functions. We now add the important additional assumption that the function f(x) is continuous in the range $[\alpha, \beta]$. Then it is clear that $p(x_0, \dots, x_n)$, defined by (20), is a continuous function of $(x_0, \dots, x_n)$ as long as the inequalities (23) hold. We now extend the definition of the function $p(x_0, \dots, x_n)$ throughout the closed domain

(29) $\alpha \le x_0 \le x_1 \le \dots \le x_n \le \beta$,

by the convention that

(30) $p(x_0, x_1, \dots, x_n) = 0$ if the $x_\nu$ are not all different.




We claim that the extended function $\rho(x_0, \dots, x_n)$ is continuous throughout 
the closed domain (29). Indeed, let $(x_0, \dots, x_n)$ be a point of (29) for which at 
least two $x_\nu$'s coalesce, hence $\rho(x_0, \dots, x_n) = 0$. Let this point $(x_0, \dots, x_n)$ 
be the limit of a sequence of points $(x_0^{(k)}, \dots, x_n^{(k)})$ with 

$\alpha \le x_0^{(k)} < x_1^{(k)} < \dots < x_n^{(k)} \le \beta$ $(k = 1, 2, \dots)$. 

We have to show that 

(31) $\lim_{k \to \infty} \rho(x_0^{(k)}, \dots, x_n^{(k)}) = 0$. 

Since at most $n$ of the points $x_\nu$ are distinct, there clearly exists a polynomial 
$P(x)$, of degree $n - 1$ or less, so that 

$f(x_\nu) - P(x_\nu) = 0$ $(\nu = 0, \dots, n)$, 

hence by the continuity of $f(x)$ and $P(x)$ 

$\max_\nu |f(x_\nu^{(k)}) - P(x_\nu^{(k)})| \to 0$, as $k \to \infty$. 

This, together with the inequality 

$\max_\nu |f(x_\nu^{(k)}) - P(x_\nu^{(k)})| \ge \rho(x_0^{(k)}, \dots, x_n^{(k)})$, 

implies (31). 
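
The vanishing of $\rho$ at coalescing points can be observed numerically. The sketch below is not taken from the paper. It assumes the standard closed form for the best approximation on $n + 1$ distinct points by polynomials of degree $n - 1$ or less, namely $\rho = |\sum_\nu f(x_\nu)/\omega'(x_\nu)| / \sum_\nu 1/|\omega'(x_\nu)|$ with $\omega(x) = \prod_\nu (x - x_\nu)$, which may differ in form from the definition (20). The function name `rho` and the test function $f(x) = |x|$ are illustrative choices.

```python
from fractions import Fraction
from math import prod

def rho(f, xs):
    """Best approximation of f on the points xs by polynomials of degree <= len(xs) - 2.

    Assumed closed form (valid for pairwise distinct points):
        rho = |sum_v f(x_v) / w'(x_v)| / sum_v 1 / |w'(x_v)|,
    where w(x) = prod_v (x - x_v). Following convention (30), the value is 0
    when the points are not all different.
    """
    xs = [Fraction(x) for x in xs]
    if len(set(xs)) < len(xs):
        return Fraction(0)
    # w'(x_i) is the product of (x_i - x_j) over all j != i.
    dw = [prod(x - y for j, y in enumerate(xs) if j != i) for i, x in enumerate(xs)]
    numerator = abs(sum(Fraction(f(x)) / d for x, d in zip(xs, dw)))
    denominator = sum(1 / abs(d) for d in dw)
    return numerator / denominator

# f(x) = |x| is continuous; let x_2 = eps approach x_1 = 0 (n = 2).
values = [rho(abs, [-1, 0, Fraction(1, 10 ** k)]) for k in range(5)]
for k, value in enumerate(values):
    print(f"eps = 10^-{k}: rho = {float(value):.6f}")
```

For this particular choice the computed values equal $\varepsilon/(1 + \varepsilon)$, so they decrease to the value 0 prescribed by (30) as the two points coalesce.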

Let us now return to our Theorem 4 to note the effect of the continuity of 
$f(x)$. Let us assume that the best approximation $\rho$ is positive. The continuous 
function $\rho(x_0, \dots, x_n)$ assumes its maximum value $\rho$ at a point $(x_0^*, \dots, x_n^*)$ 
and because $\rho$ is positive we must have 

$\alpha \le x_0^* < x_1^* < \dots < x_n^* \le \beta$. 

We may now readily establish contact with the classical oscillation properties 
of the polynomial $P^*(x)$ of best approximation† $\rho$. In the first place $P^*(x)$ 
is now uniquely defined: Indeed, a polynomial $P(x)$ of best approximation 
$\rho = \rho(x_0^*, \dots, x_n^*)$ must satisfy the inequalities 
(32) $|f(x_\nu^*) - P(x_\nu^*)| \le \rho = \rho(x_0^*, \dots, x_n^*)$ $(\nu = 0, \dots, n)$, 
while we know from the discussion of the case of $n + 1$ points that there is 
only one polynomial satisfying (32); since $P^*(x)$ does satisfy (32), $P^*(x)$ is 
uniquely defined. Secondly, we know that the sequence 

$u_\nu^* = f(x_\nu^*) - P^*(x_\nu^*)$ $(\nu = 0, \dots, n)$, 

has all its elements of absolute value equal to $\rho$ and that they alternate in 
sign. These are the classical oscillation properties referred to above. Our 
example (28) shows that no such properties, beyond the general relation (19), 
hold in the case of discontinuous functions. 
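
The oscillation properties can be checked on a small example. The following sketch is an illustration, not the authors' construction. It assumes that, for $n + 1$ distinct points in increasing order, the polynomial of best approximation of degree $n - 1$ or less can be found by solving the linear system $f(x_\nu) - P(x_\nu) = (-1)^\nu h$ for the coefficients of $P$ and the signed deviation $h$, so that $|h|$ plays the role of $\rho$. The names `solve_linear` and `best_on_points`, and the choice $f(x) = |x|$ on the points $-1, 0, 1$, are hypothetical.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination over Fractions."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick any row at or below the diagonal with a nonzero entry in this column.
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]

def best_on_points(f, xs):
    """Return (coeffs, h) with f(x_v) - P(x_v) = (-1)**v * h on the sorted points.

    coeffs lists the coefficients of P in increasing powers of x; P has degree
    at most len(xs) - 2. The points are assumed to be pairwise distinct.
    """
    xs = sorted(Fraction(x) for x in xs)
    n = len(xs) - 1
    rows = [[x ** j for j in range(n)] + [Fraction((-1) ** v)] for v, x in enumerate(xs)]
    rhs = [Fraction(f(x)) for x in xs]
    sol = solve_linear(rows, rhs)
    return sol[:n], sol[n]

points = [-1, 0, 1]
coeffs, h = best_on_points(abs, points)
errors = [abs(x) - sum(c * Fraction(x) ** j for j, c in enumerate(coeffs)) for x in points]
print("coefficients:", coeffs, "deviation:", h, "errors:", errors)
```

Here $P^*(x) = 1/2$, and the deviations at $-1, 0, 1$ are $1/2, -1/2, 1/2$: equal in absolute value and alternating in sign.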


† See [9], pp. 76-78. 


APPENDIX (Added July 1, 1949). The authors are much indebted to the 
referee for the following two valuable references: 

1. The theorem of Dines and McCoy of our Introduction is an immediate 
corollary of a theorem of E. Steinitz, Bedingt konvergente Reihen und konvexe 
Systeme II, Journal für Mathematik, vol. 144 (1914), pp. 1-40. On pp. 12-13 
Steinitz defines a family of rays in $E_n$, with common initial point $O$, to 
be all-sided provided there are rays of the family on each side of every 
hyperplane through $O$. An all-sided family is irreducible if no proper sub-family 
is all-sided. Steinitz then proves the following 


THEOREM. In any all-sided family of rays, there is contained at least one 
irreducible sub-family; such a sub-family has at least n + 1 and at most 2n rays. 

The Dines-McCoy theorem, stated here for the space $E_n$ rather than for the 
space used in the Introduction, follows thus: 
Let $\{H_\nu\}$ be the collection of half-spaces of the theorem and let $R_\nu$ denote 
the interior ray through $O$ normal to the hyperplane bounding $H_\nu$. Suppose 
that these half-spaces have no common ray. Then for every ray $\rho$ through $O$, 
for some $\nu$ we must have $\angle(R_\nu, \rho) > \pi/2$; applying this remark to $\rho$ and $-\rho$, 
we see that $\{R_\nu\}$ is an all-sided family of rays. By Steinitz's theorem there 
is an all-sided sub-collection $R_1, R_2, \dots, R_s$, say, with $n + 1 \le s \le 2n$. But 
then the corresponding $H_1, H_2, \dots, H_s$ have no ray in common, in 
contradiction to the assumption of the theorem. 
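
Steinitz's bounds can be illustrated in the plane ($n = 2$), where an irreducible all-sided family should have 3 or 4 rays. The sketch below is only an illustration under stated assumptions: rays are represented by nonzero integer direction vectors, the six-ray family is a hypothetical example, and the helper names `is_all_sided` and `is_irreducible` are not from the paper. The test uses the fact that a planar family fails to be all-sided exactly when some nonzero $u$ has $u \cdot r \le 0$ for every ray $r$, and that such a $u$ can then be found among the normals of the rays themselves.

```python
from itertools import combinations

def is_all_sided(rays):
    """Planar test: are there rays strictly on each side of every line through O?

    Rays are nonzero integer direction vectors (a, b). The family is not
    all-sided exactly when some nonzero u satisfies u . r <= 0 for every ray r;
    in the plane it suffices to try the two normals of each ray.
    """
    rays = list(rays)
    for a, b in rays:
        for u in ((-b, a), (b, -a)):
            if all(u[0] * c + u[1] * d <= 0 for c, d in rays):
                return False
    return bool(rays)

def is_irreducible(rays):
    """All-sided, and no longer all-sided after removing any single ray.

    Checking single removals is enough, because every family containing an
    all-sided family is itself all-sided.
    """
    rays = list(rays)
    return is_all_sided(rays) and not any(
        is_all_sided(rays[:i] + rays[i + 1:]) for i in range(len(rays))
    )

# A hypothetical family of six rays in the plane (n = 2).
family = [(1, 0), (0, 1), (-1, 0), (0, -1), (1, 1), (-1, -1)]
sizes = sorted({
    len(sub)
    for k in range(1, len(family) + 1)
    for sub in combinations(family, k)
    if is_irreducible(sub)
})
print(sizes)
```

For this family the irreducible sub-families have sizes 3 and 4 only, in agreement with the bounds $n + 1 = 3$ and $2n = 4$. The four coordinate rays show that the upper bound is attained.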

2. The Dresher-Harris theorem of our Introduction was fully discussed (for 
$n = 1$) by L. A. Santaló, Complemento a la nota: Un teorema sobre conjuntos de 
paralelepípedos de aristas paralelas, Publicaciones del Instituto de Matemática 
de la Universidad Nacional del Litoral, vol. 3 (1942), pp. 203-210. Also its 
proof by means of Helly's theorem is found in footnote 4 on page 207 and 
attributed by Santaló to J. Rey Pastor. 


REFERENCES 


[1] P. Alexandroff and H. Hopf, Topologie, vol. 1, Berlin, 1935. 

[2] L. M. Blumenthal, Metric methods in linear inequalities, Duke Math. J., vol. 15 (1948), 
955-966. 

[3] L. L. Dines and N. H. McCoy, On linear inequalities, Trans. Roy. Soc. Can., Third Series, 
Sec. III, vol. 27 (1933), 37-70. 

[4] E. Helly, Über Mengen konvexer Körper mit gemeinschaftlichen Punkten, Jahresbericht 
der deutschen Mathematiker Vereinigung, vol. 32 (1923), 175-176. 

[5] P. Kirchberger, Über Tschebyschefsche Annäherungsmethoden, Math. Ann., vol. 57 (1903), 
509-540. The same paper appeared also in more elaborate form (96 pages) in 1902 
as a doctoral dissertation written under Hilbert's guidance. 

[6] D. König, Über konvexe Körper, Math. Zeit., vol. 14 (1922), 208-210. 

[7] J. Radon, Mengen konvexer Körper, die einen gemeinsamen Punkt enthalten, Math. Ann., 
vol. 83 (1921), 113-115. 

[8] A. Tarski, A decision method for elementary algebra and geometry, The Rand Corporation, 
1948, 57 pages. 

[9] Ch. J. de la Vallée Poussin, Leçons sur l'approximation des fonctions d'une variable réelle, 
Paris, 1919. 


The University of Pennsylvania 





